EVALUATION OF MPSRCH - A NEW TOOL FOR SEQUENCE SEARCHING.

STIC is evaluating the MPSRCH software for running your sequence searches on our new MASPAR
(massively parallel processing) computer. The software was written by John Collins and
S. Sturrock at Edinburgh University, U.K., and is distributed by IntelliGenetics, Inc.

Some of the advantages of the MPSRCH/MASPAR combination over our current configuration
of FASTDB on SUN workstations are as follows:

1 Speed - The MASPAR machine has 16,000 processors, compared to 8 on our fastest SUN
server, and is therefore capable of millions of cell updates/sec. For example, a protein search that took 30
minutes on the SUN was completed in 25 seconds on the MASPAR.

2 The ability to run the full Smith-Waterman optimization algorithm. This is especially important
for DNA searches, as we cannot currently justify doing this on the SUN because of the substantially increased
run times. The DNA search results that you presently receive from FASTDB are ranked by the initial score only. In
most cases this alignment is satisfactory; however, the optimization step offered in MPSRCH ensures the
best ranking of the initial hits, so that the "better" alignments are located closer to the top of the
alignment table.

3 The complementary (inverse) DNA strand is searched automatically in MPSRCH. These results are
denoted by a lowercase "c" next to the Result No. in the SUMMARIES table.

4 Full support for all IUPAC nucleic acid ambiguity codes, where appropriate.
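
As a rough illustration of points 2 to 4 above, the following Python sketch computes a Smith-Waterman local alignment score for a query and for its reverse complement, treating IUPAC ambiguity codes as matches when they share a base. It is not MPSRCH code: the function names, the scoring values (match +1, mismatch -1, gap -2) and the rule for scoring ambiguity codes are assumptions chosen for illustration only.

```python
# Illustrative sketch only: names and scoring values are assumptions,
# not MPSRCH's actual parameters or implementation.

# IUPAC nucleic acid codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code (ambiguity codes complement to ambiguity codes).
COMPLEMENT = str.maketrans("ACGTURYSWKMBDHVN", "TGCAAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleic acid sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def compatible(a: str, b: str) -> bool:
    """True when two IUPAC codes share at least one possible base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def smith_waterman_score(query: str, target: str,
                         match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (linear gap penalty), computed row by row."""
    query, target = query.upper(), target.upper()
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if compatible(q, t) else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def search_both_strands(query: str, target: str) -> tuple:
    """Score the query and its reverse complement against one database entry."""
    forward = smith_waterman_score(query, target)
    reverse = smith_waterman_score(reverse_complement(query), target)
    return forward, reverse
```

With these assumed scores, a query aligned against itself scores its own length, which corresponds to the "Perfect Score" reported in the header of the search output.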

MPSRCH/MASPAR format vs. FASTDB/SUN format 

1 Lack of display context:

At present there is no user-selectable display context, so you cannot see the 10-50 bases
on either side of the aligned sequence, as you can in the FASTDB/SUN format.

2 Match % anomaly: (See Example 1 below). 

Look at Result No. 7 in the SUMMARIES table and in the alignment. 

Note:

a In the SUMMARIES table, the % Query Match is the score of the search (17 in this example)
expressed as a percentage of the perfect score (18 in this example). It is determined as follows:

1 The perfect score is calculated as the score of the query against itself (18 in this example).

2 The score of the query against the database is determined (17 in this example).

3 The % query match is calculated as:

% Query Match = (score / perfect score) * 100; in this example, (17 / 18) * 100 = 94.4 %

b The match % shown in the annotation refers to the "Best Locally Similar Alignment". This is
defined as the point where no further improvement in the score can be obtained, even if the search is
continued to the ends of the sequences. The aligned region shown thus represents the best possible
continuation of the alignment. In this example, it is calculated thus:

Match % = (score / Matches) * 100; in this example, (17 / 17) * 100 = 100.0 %

The Examiners are cautioned to inspect both the % Query Match in the SUMMARIES table and Match 
% in the annotation in view of the above explanation, before using any alignment for rejection. Future 
versions of the MPSRCH software will incorporate both % match values in the annotation. 
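
The two percentages described above can be reproduced with a short calculation. The Python sketch below uses the figures from Result No. 7 of Example 1; the function names are hypothetical and the formulas simply restate the definitions given in this memo, so it should be read as an illustration rather than as the way MPSRCH computes these values internally.

```python
def percent_query_match(score: float, perfect_score: float) -> float:
    """% Query Match as shown in the SUMMARIES table: score relative to
    the score of the query against itself."""
    return round(100.0 * score / perfect_score, 1)


def percent_annotation_match(score: float, matches: float) -> float:
    """Match % as shown in the alignment annotation, following the memo's
    definition: score relative to the number of matches in the aligned region."""
    return round(100.0 * score / matches, 1)


# Figures for Result No. 7 in Example 1.
summary_value = percent_query_match(score=17, perfect_score=18)
annotation_value = percent_annotation_match(score=17, matches=17)
print(summary_value, annotation_value)
```

The two values differ because the first is measured against the whole query, while the second is measured only against the aligned region.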
Release 2.0 John F. Collins & S. S. Sturrock, Biocomputing Research Unit.
Copyright (c) 1993, 1994 by University of Edinburgh, U.K.
Distribution rights by IntelliGenetics, Inc.
MPsrch_nn n.a. - n.a. database search, using Smith-Waterman algorithm
Run on: Thu Feb 23 09:09:16 1995; MasPar time 4.49 Seconds

195.126 Million cell updates/sec

Tabular output not generated. 
Title: >US-08-XXX-XXX-X 
Description: (1:18) from US08XXXXXX.seq 

Perfect Score: 18 

N.A. Sequence: TTAGGGTTAGGGTTAGGG 18
Comp: AATCCCAATCCCAATCCC 
Gap 60Nmatch STD : Dbase 0; Query 0 
Searched: 57621 seqs, 24347505 bases x 2 
Database: n-geneseq 

Statistics: Mean 3.876; Variance 2.020; scale 1.919 

Predicted No. is the number of results expected by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length  DB  ID       Description            Pred. No.

   c 1      18   100.0      24  10  Q52411   Human telomere length  3.40e-04
   c 2      18   100.0      20  10  Q51150   Human telomeric singl  3.40e-04
     3      18   100.0      60  10  Q51147   Human telomeric singl  3.40e-04
     4      18   100.0      24  10  Q52413   Human CD4+ lymphocyte  3.40e-04
     5      18   100.0      36  11  Q63638   Human herpes virus 7   3.40e-04
     6      18   100.0      18  10  Q52410   Human telomere length  3.40e-04
   c 7      17    94.4     163   1  Q06654   Feline T-cell lymphot  1.62e-03



ALIGNMENTS 



RESULT 7 

ID Q06654 standard; DNA; 163 BP. AC Q06654; 
DT 26-FEB-1991 (first entry) 

DE Feline T-cell lymphotrophic lentivirus of clone 2BYCXL2. 

KW Feline T-cell lymphotropic lentivirus; FIV; 2BYCXL2; antibodies;

KW vaccines; ds. 

OS Feline T-cell lymphotropic lentivirus 2428 (Petaluma).

FT CDS 2..163

FT /*tag= a 

FT /label = FIV 

PN WO9013573-A. 

PD 15-NOV-1990. 

PF 30-APR-1990; U02338. 

PR 08-MAY-1989; US-348784. 

PR 08-DEC-1989; US-447810. 

PA (IDEX-) IDEXX CORP. 

PI Anderson PR, Oconnor TP, Tonelli QJ; 

DR WPI; 90-361429/48. 

DR P-PSDB; R08085. 

PT Feline T-cell lympho-tropic lentivirus poly-peptide(s) - used for 
PT specific detection of FIV antibodies, prodn. of antibodies and in 
PT vaccines 

PS Disclosure; Fig 5(b); 37pp; English. 

CC See also Q06653-55 and R08094-96. 

SQ Sequence 163 BP; 28 A; 66 C; 37 G; 30 T; 

DB 1; Score 17; Match 100.0%; Predicted No. 1.62e-03; 
Matches 17; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Db  95 cctaaccctaaccctaa 111
       |||||||||||||||||
Cp  17 CCTAACCCTAACCCTAA 1






SERIAL NUMBER: DATE SEARCHED: 

CONFIDENTIAL - FOR INTERNAL PTO USE ONLY. 

EXAMINER INPUT REQUEST - EVALUATION OF SEQUENCE SEARCHES DONE ON THE
MPSRCH/MASPAR SYSTEM.

Your sequence search request was run in parallel on the FASTDB/SUN system and on the new
MPSRCH/MASPAR configuration. STIC would greatly appreciate it if you could take a few minutes of your
time to answer the questions below. Your participation in this survey is completely
optional. Please review the search results from the MPSRCH/MASPAR configuration in view of the
FASTDB/SUN results and give us your feedback on the following questions:

1. What are your general impressions of the MPSRCH format? What did you like? What did you
dislike?



2. Did MPSRCH/MASPAR find any especially relevant alignments that were missed by the
FASTDB/SUN search?



3. Did MPSRCH/MASPAR miss any especially relevant alignments that were found by the
FASTDB/SUN search?



4. Did MPSRCH/MASPAR find any especially relevant alignments in the top ten positions that
were much lower down in the FASTDB/SUN alignment table?



5. Did you find the MPSRCH/MASPAR % alignment ambiguity (as illustrated by Example 1 above)
and the lack of display context to be a major hindrance to your understanding of
the MPSRCH/MASPAR results?



6. Any other comments? Thank You. 



PLEASE RETURN THIS FORM TO THE SEARCH INPUT TRAY IN THE 12TH FLOOR
COMPUTER CLUSTER OR TO THE REFERENCE DESK/INPUT TRAY IN THE
BIOTECH/CHEM LIBRARY.
Release 2.0 John F. Collins & S. S. Sturrock, Biocomputing Research Unit.
Copyright (c) 1993, 1994 by University of Edinburgh, U.K.
Distribution rights by IntelliGenetics, Inc.

MPsrch_pp protein - protein database search, using Smith-Waterman algorithm

Run on: Fri Mar 24 07:42:16 1995; MasPar time 2.88 Seconds

56.789 Million cell updates/sec

Tabular output not generated. 



Title:          >US-08-300-510-1
Description:    (1:27) from US08300510.pep
Perfect Score:  195
Sequence:       1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27

Scoring table:  PAM 150
                Gap 14

Searched:       50375 seqs, 6065180 residues

Database:       a-geneseq
                 1  a-gen1
                 2  a-gen2
                 3  a-gen3
                 4  a-gen4
                 5  a-gen5
                 6  a-gen6
                 7  a-gen7
                 8  a-gen8
                 9  a-gen9
                10  a-gen10



Statistics: Mean 22.071; Variance 75.122; scale 0.294

Predicted No. is the number of results expected by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length  DB  ID        Description            Pred. No.

     1     195   100.0      88   8  R41984    Human T cell reactive  1.14e-13
     2     195   100.0      92   8  R41983    Human T cell reactive  1.14e-13
     3     195   100.0      27   8  R41975    Human T cell reactive  1.14e-13
     4     195   100.0      96   7  R36548    Recombitope YZX.       1.14e-13
     5     195   100.0      94   3  R12119    TRFP chain 1 with lea  1.14e-13
     6     195   100.0      92   7  R36539    TRFP chain 1 (with Le  1.14e-13
     7     195   100.0      88   7  R36540    TRFP chain 1 (with Le  1.14e-13
     8     195   100.0      27   7  R36542    Peptide X.             1.14e-13
     9     195   100.0      96   3  R12120    TRFP chain 1 with lea  1.14e-13






    10     195   100.0      96   5  R27368    TRFP Chain #1 with CI  1.14e-13
    11     195   100.0      94   5  R27367    TRFP Chain #1 with Ci  1.14e-13
    12      67    34.4     422  10  R54202    snaA gene product inv  8.62e+00
    13      65    33.3     453   9  R47872    Enzyme/biocatalyst wh  1.33e+01
    14      63    32.3     234 [?]  R20746    Human R-PTPase beta s  2.03e+01
    15      63    32.3     452   7  R1311[?]  Phenylalanine hydroxy  2.03e+01
    16      63    32.3     704   1  P80087    Sequence of 85 kd pro  2.03e+01
    17      63    32.3     [?] [?]  R20743    Human R-PTPase gamma   2.03e+01
    18      62    31.8     [?]   1  P90181    Cross-reactive materi  2.51e+01
    19      59    30.3     [?]   2  P70668    D-alanine racemase.    4.71e+01
    20      58    29.7    1480   3  R13300    CFTR Y563N.            5.79e+01
    21      57    29.2    1480   3  R13297    CFTR S549R.            7.11e+01
    22      57    29.2     [?]  10  R51059    Sequence of plasmid p  7.11e+01
    23      57    29.2    1190   3  R13308    CFTR 3659 del C.       7.11e+01
    24      57    29.2     [?] [?]  R13234    CFTR G178R.            7.11e+01
    25      57    29.2    1100   1  P95644    Rabbit seletal muscle  7.11e+01
    26      57    29.2    1480   4  R22492    Cystic Fibrosis trans  7.11e+01
    27      57    29.2    1480   3  R13235    CFTR A455E.            7.11e+01
    28      57    29.2    1480   3  R13894    Cystic fibrosis trans  7.11e+01
    29      57    29.2    1480   3  R13299    CFTR R560T.            7.11e+01
    30      57    29.2    1480   3  R13232    CFTR G85E.             7.11e+01
    31      57    29.2    1480   2  R11115    Cystic fibrosis trans  7.11e+01
    32      57    29.2    1479   2  R11602    Mutant cystic fibrosi  7.11e+01
    33      57    29.2    1480   3  R13302    CFTR L1077P.           7.11e+01
    34      57    29.2    1480   4  R23074    Cystic fibrosis gene   7.11e+01
    35      57    29.2    1480   3  R13233    CFTR I148T.            7.11e+01
    36      57    29.2    1479   3  R13231    CFTR delta 1507.       7.11e+01
    37      57    29.2    1091   3  R13303    CFTR Y1092X.           7.11e+01
    38      57    29.2    1091   7  R33553    Sequence of the alpha  7.11e+01
    39      57    29.2    1480   3  R13293    CFTR G551D.            7.11e+01
    40      56    28.7     508  10  R51696    Human PDI.             8.72e+01
    41      56    28.7      77   7  [?]       Recombitope ZXY.       8.72e+01
    42      56    28.7     491   6  R25296    Recombinant PDI (Aspl  8.72e+01
    43      56    28.7     508   6  R25297    PDI.                   8.72e+01
    44      56    28.7     515  10  R51697    Human HSA-PDI fusion   8.72e+01
    45      56    28.7     508   1  P80664    Polypeptide with prot  8.72e+01



ALIGNMENTS 

RESULT 1
ID   R41984 standard; Protein; 88 AA.
AC   R41984;
DT   21-APR-1994 (first entry)
DE   Human T cell reactive feline protein B chain 1.
KW   Human; T cell; reactive; feline; protein; immune response; antigen;
KW   tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW   Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW   Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS   Homo sapiens.
FH   Key             Location/Qualifiers
FT   Peptide         1..17
FT                   /note= "Signal peptide"
FT   Protein         18..88
FT                   /note= "Mature protein"
PN   WO9319178-A.
PD   30-SEP-1993.
PF   25-MAR-1993; U02462.
PR   25-MAR-1992; US-857311.
PR   15-MAY-1992; US-884718.
PR   15-JAN-1993; US-006116.
PA   (IMMU-) IMMUNOLOGIC PHARM CORP.
PI   Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M;
PI   Morville M;
DR   WPI; 93-320744/40.
DR   N-PSDB; Q49534.
PT   New peptide(s) for inducing tolerance - comprise one or more
PT   epitope(s) of an allergen administered subcutaneously, for
PT   treating sensitivity to cats, bees, etc.
PS   Disclosure; Fig 1; 107pp; English.
CC   The sequences given in R41983-84 represent chain 1 of human T cell
CC   reactive feline proteins (TRFP) A and B respectively. Peptides
CC   derived from TRFP may be used in a therapeutic composition which is
CC   useful in treating diseases which involve an immune response to a
CC   protein antigen. This composition may be used to induce tolerance
CC   in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC   Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC   Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC   in humans.
SQ   Sequence 88 AA;

DB 8; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  25 krdvdlfltgtpdeyveqvaqykalpv 51
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 2
ID   R41983 standard; Protein; 92 AA.
AC   R41983;
DT   21-APR-1994 (first entry)
DE   Human T cell reactive feline protein A chain 1.
KW   Human; T cell; reactive; feline; protein; immune response; antigen;
KW   tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW   Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW   Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS   Homo sapiens.
FH   Key             Location/Qualifiers
FT   Peptide         1..22
FT                   /note= "Signal peptide"
FT   Protein         23..92
FT                   /note= "Mature protein"
PN   WO9319178-A.
PD   30-SEP-1993.
PF   25-MAR-1993; U02462.
PR   25-MAR-1992; US-857311.
PR   15-MAY-1992; US-884718.
PR   15-JAN-1993; US-006116.
PA   (IMMU-) IMMUNOLOGIC PHARM CORP.
PI   Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M;
PI   Morville M;
DR   WPI; 93-320744/40.
DR   N-PSDB; Q49533.
PT   New peptide(s) for inducing tolerance - comprise one or more
PT   epitope(s) of an allergen administered subcutaneously, for
PT   treating sensitivity to cats, bees, etc.
PS   Disclosure; Fig 1; 107pp; English.
CC   The sequences given in R41983-84 represent chain 1 of human T cell
CC   reactive feline proteins (TRFP) A and B respectively. Peptides
CC   derived from TRFP may be used in a therapeutic composition which is
CC   useful in treating diseases which involve an immune response to a
CC   protein antigen. This composition may be used to induce tolerance
CC   in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC   Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC   Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC   in humans.
SQ   Sequence 92 AA;

DB 8; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  29 krdvdlfltgtpdeyveqvaqykalpv 55
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 3
ID   R41975 standard; peptide; 27 AA.
AC   R41975;
DT   21-APR-1994 (first entry)
DE   Human T cell reactive feline protein fragment X.
KW   Human; T cell; reactive; feline; protein; immune response; antigen;
KW   tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW   Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW   Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen; ss.
OS   Homo sapiens.
PN   WO9319178-A.
PD   30-SEP-1993.
PF   25-MAR-1993; U02462.
PR   25-MAR-1992; US-857311.
PR   15-MAY-1992; US-884718.
PR   15-JAN-1993; US-006116.
PA   (IMMU-) IMMUNOLOGIC PHARM CORP.
PI   Briner TJ, Garman RD, Gefter ML, Greenstein JL;
PI   Kuo M, Morville M;
DR   WPI; 93-320744/40.
PT   New peptide(s) for inducing tolerance - comprise one or more
PT   epitope(s) of an allergen administered subcutaneously, for
PT   treating sensitivity to cats, bees, etc.
PS   Claim 1; Fig 3; 107pp; English.
CC   The sequences given in R41975-82 are peptides derived from a human T
CC   cell reactive feline protein. These peptides are used in a
CC   therapeutic composition which is useful in treating diseases which
CC   involve an immune response to a protein antigen. This composition
CC   may be used to induce tolerance in a mammal to Dermatophagoides,
CC   Felis, Ambrosia, Lolium, Cryptomeria, Alternaria, Alder, Betula,
CC   Quercus, Olea, Artemesia, Plantago, Parietaria, Canis, Blattella,
CC   Apis, Periplaneta and to autoantigens in humans.
SQ   Sequence 27 AA;

DB 8; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   1 krdvdlfltgtpdeyveqvaqykalpv 27
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 4
ID   R36548 standard; Protein; 96 AA.
AC   R36548;
DT   12-AUG-1993 (first entry)
DE   Recombitope YZX.
KW   Human T cell reactive feline protein; TRFP; epitope; recombitope;
KW   sensitivity; Felis domesticus.
OS   Synthetic.
FH   Key             Location/Qualifiers
FT   Cleavage_site   14..15
FT                   /label= thrombin_cleavage_site
PN   WO9308280-A.
PD   29-APR-1993.
PF   16-OCT-1992; U08694.
PR   16-OCT-1991; US-777859.
PR   13-DEC-1991; US-807529.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M;
PI   Rogers BL;
DR   WPI; 93-152473/18.
DR   N-PSDB; Q41572.
PT   Recombitope peptide having T-cell stimulating activity - for the
PT   diagnosis and treatment of sensitivity to protein allergens,
PT   auto-antigens and protein antigens
PS   Disclosure; Fig 8; 73pp; English.
CC   Preferred recombitope peptides for treating sensitivity to Felis
CC   domesticus are derived from the genus Felis and comprise
CC   regions selected from peptides X, Y, Z, A and B, of TRFP, and
CC   modifications thereof, such as peptide C.
CC   Oligonucleotides C, D, E, F, G, H and I are used in the
CC   construction of recombitope peptide YZX.
SQ   Sequence 96 AA;

DB 7; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  70 krdvdlfltgtpdeyveqvaqykalpv 96
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 5
ID   R12119 standard; Protein; 94 AA.
AC   R12119;
DT   26-JUL-1991 (first entry)
DE   TRFP chain 1 with leader A.
KW   Human T cell reactive feline protein; cat allergens.
OS   Felis catus.
FH   Key             Location/Qualifiers
FT   Peptide         3..24
FT                   /label= Leader B
FT   Protein         25..94
FT                   /label= TRFP Chain 1
PN   WO9106571-A.
PD   16-MAY-1991.
PF   02-NOV-1990; U06548.
PR   03-NOV-1989; US-431565.
PA   (IMMU-) IMMULOGIC PHARM COR.
PI   Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL;
PI   Brauer AW;
DR   WPI; 91-164136/22.
DR   N-PSDB; Q11836.
PT   New pure covalently linked human T cell reactive feline protein -
PT   and modified peptide(s), used to reduce effects of cat allergens
PT   and to diagnose sensitivity to allergens.
PS   Claim 2; Fig 1; 70pp; English.
CC   Poly-A mRNA from cat parotid and mandibular glands was used to
CC   produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC   clones were then used to screen a cat genomic library. Chain 1
CC   exists in two forms having different leader sequences (A and B).
CC   The sequence can be used to express the protein and peptide derivs.
CC   which stimulate T-cells in persons allergic to cats. The peptides
CC   can be used to reduce/eliminate the allergic response partic. by
CC   modificn. of lymphokine prodn. by the T-cells. They can also be
CC   used to identify epitopes responsible for sensitivity. The DNA can
CC   be used to detect comparable sequence in other species, and also
CC   for prodn. of modified forms of TRFP esp. showing reduced binding
CC   to IgE and thus reduced tendency to cause adverse reactions.
CC   See also R12120-R12123.
SQ   Sequence 94 AA;

DB 3; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



RESULT 6
ID   R36539 standard; Protein; 92 AA.
AC   R36539;
DT   12-AUG-1993 (first entry)
DE   TRFP chain 1 (with Leader A).
KW   Human T cell reactive feline protein; TRFP; leader A; leader B;
KW   epitope.
OS   Felis.
FH   Key             Location/Qualifiers
FT   Peptide         1..22
FT                   /label= leader_peptide
PN   WO9308280-A.
PD   29-APR-1993.
PF   16-OCT-1992; U08694.
PR   16-OCT-1991; US-777859.
PR   13-DEC-1991; US-807529.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M;
PI   Rogers BL;
DR   WPI; 93-152473/18.
DR   N-PSDB; Q41556.
PT   Recombitope peptide having T-cell stimulating activity - for the
PT   diagnosis and treatment of sensitivity to protein allergens,
PT   auto-antigens and protein antigens
PS   Disclosure; Fig 1; 73pp; English.
CC   Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC   coli and purified. T cell epitope studies using overlapping peptide
CC   regions derived from the TRFP amino acids sequence were used to
CC   identify multiple T cell epitopes in each chain of TRFP.
SQ   Sequence 92 AA;

DB 7; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  29 krdvdlfltgtpdeyveqvaqykalpv 55
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 7
ID   R36540 standard; Protein; 88 AA.
AC   R36540;
DT   12-AUG-1993 (first entry)
DE   TRFP chain 1 (with Leader B).
KW   Human T cell reactive feline protein; TRFP; leader A; leader B;
KW   epitope.
OS   Felis.
FH   Key             Location/Qualifiers
FT   Peptide         1..18
FT                   /label= leader_peptide
PN   WO9308280-A.
PD   29-APR-1993.
PF   16-OCT-1992; U08694.
PR   16-OCT-1991; US-777859.
PR   13-DEC-1991; US-807529.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M;
PI   Rogers BL;
DR   WPI; 93-152473/18.
DR   N-PSDB; Q41557.
PT   Recombitope peptide having T-cell stimulating activity - for the
PT   diagnosis and treatment of sensitivity to protein allergens,
PT   auto-antigens and protein antigens
PS   Disclosure; Fig 1; 73pp; English.
CC   Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC   coli and purified. T cell epitope studies using overlapping peptide
CC   regions derived from the TRFP amino acids sequence were used to
CC   identify multiple T cell epitopes in each chain of TRFP.
SQ   Sequence 88 AA;

DB 7; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  25 krdvdlfltgtpdeyveqvaqykalpv 51
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 8
ID   R36542 standard; Protein; 27 AA.
AC   R36542;
DT   12-AUG-1993 (first entry)
DE   Peptide X.
KW   Human T cell reactive feline protein; TRFP; epitope; recombitope.
OS   Felis.
PN   WO9308280-A.
PD   29-APR-1993.
PF   16-OCT-1992; U08694.
PR   16-OCT-1991; US-777859.
PR   13-DEC-1991; US-807529.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M;
PI   Rogers BL;
DR   WPI; 93-152473/18.
PT   Recombitope peptide having T-cell stimulating activity - for the
PT   diagnosis and treatment of sensitivity to protein allergens,
PT   auto-antigens and protein antigens
PS   Disclosure; Fig 4; 73pp; English.
CC   Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC   coli and purified. T cell epitope studies using overlapping peptide
CC   regions derived from the TRFP amino acids sequence were used to
CC   identify multiple T cell epitopes in each chain of TRFP. DNA
CC   constructs were assembled in which 3 regions (encoding peptides X,
CC   Y and Z) were linked to produce DNA constructs encoding recombitope-
CC   peptides.
SQ   Sequence 27 AA;

DB 7; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   1 krdvdlfltgtpdeyveqvaqykalpv 27
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 9
ID   R12120 standard; Protein; 96 AA.
AC   R12120;
DT   26-JUL-1991 (first entry)
DE   TRFP chain 1 with leader B.
KW   Human T cell reactive feline protein; cat allergens.
OS   Felis catus.
FH   Key             Location/Qualifiers
FT   Peptide         9..26
FT                   /label= Leader B
FT   Protein         27..96
FT                   /label= TRFP Chain 1
PN   WO9106571-A.
PD   16-MAY-1991.
PF   02-NOV-1990; U06548.
PR   03-NOV-1989; US-431565.
PA   (IMMU-) IMMULOGIC PHARM COR.
PI   Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL;
PI   Brauer AW;
DR   WPI; 91-164136/22.
DR   N-PSDB; Q11837.
PT   New pure covalently linked human T cell reactive feline protein -
PT   and modified peptide(s), used to reduce effects of cat allergens
PT   and to diagnose sensitivity to allergens.
PS   Claim 2; Fig 1; 70pp; English.
CC   Poly-A mRNA from cat parotid and mandibular glands was used to
CC   produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC   clones were then used to screen a cat genomic library. Chain 1
CC   exists in two forms having different leader sequences (A and B).
CC   The sequence can be used to express the protein and peptide derivs.
CC   which stimulate T-cells in persons allergic to cats. The peptides
CC   can be used to reduce/eliminate the allergic response partic. by
CC   modificn. of lymphokine prodn. by the T-cells. They can also be
CC   used to identify epitopes responsible for sensitivity. The DNA can
CC   be used to detect comparable sequence in other species, and also
CC   for prodn. of modified forms of TRFP esp. showing reduced binding
CC   to IgE and thus reduced tendency to cause adverse reactions.
CC   See also R12119-R12123.
SQ   Sequence 96 AA;

DB 3; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  33 krdvdlfltgtpdeyveqvaqykalpv 59
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 10
ID   R27368 standard; protein; 96 AA.
AC   R27368;
DT   25-FEB-1993 (first entry)
DE   TRFP Chain #1 with CI leader B sequence.
KW   T cell reactive feline protein; cat allergy; allergic; IgE;
KW   desensitizing;
OS   Felis domesticus.
FH   Key             Location/Qualifiers
FT   Peptide         1..27
FT                   /label= Leader B
FT   Protein         28..96
FT                   /label= TRFP chain #1
PN   WO9215613-A.
PD   17-SEP-1992.
PF   20-FEB-1992; U01344.
PR   28-FEB-1991; US-662193.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond J, Kuo M;
DR   WPI; 92-331670/40.
PT   Modified human T-cell reactive feline protein - stimulates T-cell
PT   in individuals allergic to cats and shows reduced
PT   histamine-releasing properties
PS   Claim 1; Fig 1; 35pp; English.
CC   This sequence represents a modified human T-cell reactive feline
CC   protein which stimulates T-cells from an individual who is allergic
CC   to cats, but which interacts with human IgE to a lesser extent than
CC   does affinity purified TRFP. The protein is modified by treating
CC   with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC   amines) or an enzyme which removes O-linked groups (carbohydrate
CC   moieties). It is useful in desensitising people who are allergic to cats.
SQ   Sequence 96 AA;

DB 5; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  33 krdvdlfltgtpdeyveqvaqykalpv 59
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 11
ID   R27367 standard; protein; 94 AA.
AC   R27367;
DT   25-FEB-1993 (first entry)
DE   TRFP Chain #1 with Ci leader A sequence.
KW   T cell reactive feline protein.
OS   Felis domesticus.
FH   Key             Location/Qualifiers
FT   Peptide         1..25
FT                   /label= Leader A
FT   Protein         25..94
FT                   /label= TRFP chain #1
PN   WO9215613-A.
PD   17-SEP-1992.
PF   20-FEB-1992; U01344.
PR   28-FEB-1991; US-662193.
PA   (IMMU-) IMMULOGIC PHARM CORP.
PI   Bond J, Kuo M;
DR   WPI; 92-331670/40.
PT   Modified human T-cell reactive feline protein - stimulates T-cell
PT   in individuals allergic to cats and shows reduced
PT   histamine-releasing properties
PS   Claim 1; Fig 1; 35pp; English.
CC   This sequence represents a modified human T-cell reactive feline
CC   protein which stimulates T-cells from an individual who is allergic
CC   to cats, but which interacts with human IgE to a lesser extent than
CC   does affinity purified TRFP. The protein is modified by treating
CC   with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC   amines) or an enzyme which removes O-linked groups (carbohydrate
CC   moieties). It is useful in desensitising people who are allergic to cats.
SQ   Sequence 94 AA;

DB 5; Score 195; Match 100.0%; Predicted No. 1.14e-13;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db  31 krdvdlfltgtpdeyveqvaqykalpv 57
       |||||||||||||||||||||||||||
Qy   1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27



RESULT 12
ID   R54202 standard; Protein; 422 AA.
AC   R54202;
DT   18-NOV-1994 (first entry)
DE   snaA gene product involved in streptogramin biosynthetic pathway.
KW   Antibiotic; streptogramin; snaA; snaB; snaC; biosynthesis; enzyme;
KW   biosynthetic pathway; Streptomyces pristinaespiralis.
OS   Streptomyces pristinaespiralis.
PN   FR2696189-A.
PD   01-APR-1994.
PF   25-SEP-1992; 011441.
PR   25-SEP-1992; FR-011441.
PA   (RHON ) RHONE POULENC RORER SA.
PI   Blanc V, Blanche F, Crouzet J, Jacques N, Lacroix P;
PI   Thibaut D, Zagorec M;
DR   N-PSDB; Q64202.
PT   DNA involved in streptogramin antibiotic biosynthesis - for
PT   prodn. or bio-conversion of streptogramin(s) or prodn. of
PT   streptogramin intermediates, derivs. or hybrid antibiotics
PS   Claim 21; Page 49-51; 83pp; French.
CC   The snaA gene product is involved in the biosynthesis of
CC   streptogramins, antibiotics active against Gram-positive bacteria.
CC   The identification of the sequences encoding the enzymes involved
CC   in the biosynthetic pathway means that they can be isolated and
CC   manipulated. Mutant microorganisms in which a step in the
CC   streptogramin biosynthetic pathway is blocked can be cultured to
CC   produce streptogramin intermediates, which may later be converted
CC   to streptogramin derivatives. Recombinant cells may also be used
CC   for the bioconversion of streptogramins from one form to another or
CC   for the production of hybrid antibiotics.
SQ   Sequence 422 AA;

DB 10; Score 67; Match 33.3%; Predicted No. 8.62e+00;
Matches 6; Conservative 10; Mismatches 1; Indels 1; Gaps 1;

Db 370 nidfpylpgsaddfvdhv 387 

;;|: ;| MM-PM 
Qy 3 DVDL-FLTGTPDEYVEQV 19 



RESULT 13
ID   R47872 standard; Protein; 453 AA.
AC   R47872;
DT   02-AUG-1994 (first entry)
DE   Enzyme/biocatalyst which desulphurises a fossil fuel.
KW   Enzyme; biocatalyst; fossil fuel; oxidation; cleavage;
KW   organosulphur compounds; coal.
OS   Rhodococcus rhodochrous.
PN   WO9401563-A.
PD   20-JAN-1994.
PF   09-JUL-1993; U06497.
PR   10-JUL-1992; US-911845.
PA   (ENER-) ENERGY BIOSYSTEMS CORP.
PI   Denome SA, Kovacevich BR, Piddington CS, Rambosek J;
PI   Young KD;
DR   WPI; 94-035068/04.
DR   N-PSDB; Q55131.
PT   DNA encoding a biocatalyst which desulphurises fossil fuels -
PT   obtd. from Rhodococcus rhodochrous bacteria; used to produce
PT   microorganisms which degrade organic sulphur cpds.
PS   Disclosure; Page 72-73; 104pp; English.
CC   Microorganisms transformed with the DNA encoding the
CC   enzyme/biocatalyst can be used to produce the enzyme/biocatalyst for
CC   the selective oxidative cleavage of carbon-sulphur bonds for
CC   desulphurisation of fossil fuels which contain organosulphur
CC   compounds.
SQ   Sequence 453 AA;

DB 9; Score 65; Match 61.5%; Predicted No. 1.33e+01;
Matches 8; Conservative 3; Mismatches 2; Indels 0; Gaps 0;

Db 406 f Ipgsydefvdqv 418 

II M IPPii 
Qy 7 FLTGTPDEYVEQV 19 



RESULT 14 

ID R20746 standard; Protein; 234 AA. 
AC R20746; 
DT 2S-MAY-1992 (first entry) 
KW Receptor-type protein tyrosine phosphatase; cellular metabolism; 
KW cancer; diabetes. 
OS Homo sapiens. 
PN WO9201050-A. 
PD 23-JAN-1992. 
PF 11-JUL-1991; U04892. 
PR 11-JUL-1990; US-551270. 
PR 26-FEB-1991; US-654188. 
PA (UYNY-) NEW YORK UNIV. 
PI Schlessinger J; 
DR WPI; 92-056865/07. 
PT Human receptor-type protein tyrosine phosphatase - has DNA 
PT encoding it and antibodies specific for it, useful for screening 
PT drugs affecting R-PTPase activity, and detection of mutant genes 
PS Claim 5; Fig 5B; 77pp; English. 
CC The amino acid sequence is that of human receptor-type protein 
CC tyrosine phosphatase (R-PTPase) beta second conserved phosphatase. It 
CC is useful in methods for screening drugs and other agents which are 
CC capable of activating or inhibiting the R-PTPase activity and thereby 
CC affecting major pathways of cellular metabolism. Activation of 
CC R-PTPases, leading to dephosphorylation would serve as a counter- 
CC regulatory mechanism to prevent or inhibit growth, and may serve as 
CC an endogenous regulatory mechanism against cancer. Mutation or 
CC dysregulation of this receptor/enzyme system may promote susceptibility 
CC to cancer, diabetes, or other diseases associated with alterations in 
CC cellular phosphotyrosine metabolism. It can be used to raise antibodies 
CC which can be used in immunoassays to determine the presence and amt. 
CC of R-PTPases, or in immunoelectron microscopy for in situ detection of 
CC R-PTPase. See also R20743-R20748. 
SQ Sequence 234 AA; 

DB 4; Score 63; Match 38.9%; Predicted No. 2.03e+01; 
Matches 7; Conservative 7; Mismatches 4; Indels 0; Gaps 0; 

Db 122 dfileatqddyvlevrhf 139 

|::| : | :| " 

Qy 5 DLFLTGTPDEYVEQVAQY 22 



RESULT 15 

ID R13119 standard; Protein; 452 AA. 
AC R13119; 
DT 08-OCT-1991 (first entry) 
DE Phenylalanine hydroxylase. 
KW Hybrid; fusion; membrane translocation; binding region; HIV; 
KW infection; toxin; steroid; hormone; monoclonal antibody; antigen; 
KW diphtheria; exotoxin; phenylketonuria; cholera; interleukin; IL-2; 
KW protease; epidermal growth factor; ricin; tetanus; hexosaminidase; 
KW Shiga-like toxin A; SLT-A; PH; ligand; insulin; nuclease. 
OS Vibrio cholera. 
PN WO9109871-A. 
PD 11-JUL-1991. 
PF 21-DEC-1990; U07619. 
PR 22-DEC-1989; US-456095. 
PR 14-JUN-1990; US-538276. 
PA (SERA-) SERAGEN INC. 
PI Murphy JR; 
DR WPI; 91-222845/30. 
DR N-PSDB; Q12712. 
PT Hybrid molecules for targetting chemical entity to cell - have 
PT membrane trans-locating and cell binding-regions and used to 
PT treat HIV infection, genetic enzyme-deficiency disorders etc. 
PS Disclosure; Fig 13(1-3); 59pp; English. 
CC Hybrid molecules are produced by covalently linking 
CC (1) a portion (A) of the binding domain of a cell-binding ligand, 
CC (2) a portion (B) of a translocation domain of a protein able to 
CC translocate (C) across the cell cytoplasmic membrane, and 
CC and (3) a portion (C) which is to be introduced into the cell. 
CC (A) is derived from a steroid or polypeptide hormone, a single-chain 
CC analogue of a monoclonal antibody able to bind an antigen expressed 
CC on the cell surface, or a polypeptide toxin. 
CC (B) is derived from a toxin (e.g. diphtheria toxin or Pseudomonas 
CC exotoxin A). 
CC (A) may be derived from insulin, interleukins 2, 3 or 6 or 
CC epidermal growth factor. 
CC Suitable enzymes in (C) include cholera toxin, ricin, tetanus toxin, 
CC hexosaminidase A, protease, nuclease, SLT-A, etc. 
CC Specified examples are CT-A/DT-B'/IL-2, SLTA/DT-B'/IL-2, 
CC ricin A/DT-B'/IL-2, HIVP-BP/DT-B'/IL-2 and the phenylalanine 
CC hydroxylase-DT-B' or their biologically active mutants. 
CC (CT-A= cholera toxin, DT-B'= truncated diphtheria toxin, 
CC SLTA= Shiga-like toxin A; HIVP-BP= HIV protease binding protein. 
CC See also Q12710-12. 
SQ Sequence 452 AA; 

DB 3; Score 63; Match 42.1%; Predicted No. 2.03e+01; 
Matches 8; Conservative 6; Mismatches 5; Indels 0; Gaps 0; 



Db 304 qeiglaslgapdeyiekla 322 

I I - 1 I 1 1 - 1 '• I 
Qy 2 RDVDLFLTGTPDEYVEQVA 20 

Search completed: Fri Mar 24 07:42:29 1995 
Job time : 13 secs. 
MPsrch_pp protein - protein database search, using Smith-Waterman algorithm 
Run on:         Fri Mar 24 07:41:35 1995; MasPar time 5.07 Seconds 
                119.562 Million cell updates/sec 
Tabular output not generated. 

Title:          >US-08-300-510-1 
Description:    (1:27) from US08300510.pep 
Perfect Score:  195 
Sequence:       1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 

Scoring table:  PAM 150 
                Gap 14 
Searched:       75511 seqs, 22468834 residues 

Database:       pir43 
                1 ANN01 
                2 ANN02 
                4 UNANN01 
                5 UNANN02 
                6 UNANN03 
                7 UNANN04 
                8 UNANN05 
                9 UNANN06 
                10 UNREV1 
                11 UNREV2 
                12 UNREV3 

Statistics:     Mean 30.206; Variance 58.847; scale 0.513 

Predicted No. is the number of results expected by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
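
The Query Match percentages printed in the summary table appear to equal the alignment score divided by the Perfect Score shown in the header (195 for this query). The sketch below checks that relationship against a few score values from this report. The formula is inferred from the printed numbers rather than taken from MPsrch documentation, and the function name is hypothetical.

```python
def query_match_percent(score: int, perfect_score: int) -> float:
    """Express an alignment score as a percentage of the query's perfect score.

    Assumption: this mirrors how the report's Query Match column is derived;
    it is inferred from the printed values, not from the MPsrch manual.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 195  # "Perfect Score" from the report header

# Scores taken from the results listed in this report.
for score in (195, 189, 70, 65, 60):
    print(score, query_match_percent(score, PERFECT_SCORE))
```

Under this assumption a score of 189 corresponds to 96.9 and a score of 60 to 30.8, which agrees with the values that survive in the table.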
Match Length DB 
ID 
1 
100.0 
major cat allergen F 
major allergen chain 
1.27e-20 
189 
96.9 
heat shock protein 8 
2.40e+00 
6 
70 
35.9 
85 
11 
DNA-directed DNA pol 
1.13e+01 
33.3 
11 
Glutamate Synthase ( 
1.53e+01 
33.3 
1536 
1.53e+01 
2 
LUilSl 
1.53e+01 
65 
33.3 
4969 
9 
A37113 
ryanodine receptor* 
4 C7 "7 — i. A 1 
1.53e+01 
33.3 
agrin precursor - ch 
4 C*1 — i_ A 4 
1.53e+01 
32.8 
A36690 
sucrose alpha-glucos 
2.07e+01 
64 
32.8 
1556 
5 
D36793 
hypothetical protein 
2.07e+01 
64 
9 
A49413 
perilipin A - rat 
2.07e+01 
63 
32.3 
610 
12 
S12051 
protein-tyrosine-pho 
2.78e+01 
24 
63 
32.3 
1442 
9 
32.3 
246 
1 
WHRTF 
phenylalanine 4-mono 
32.3 
476 
9 
32.3 
protein-tyrosine-pho 
S25282 
hypothetical protein 
2.78e+01 
32.3 
2.78e+01 
 34    63   32.3   2307   9  A46700   receptor-type protei  2.78e+01 
 35    63   32.3    704   5  A26125   heat shock protein 9  2.78e+01 
 36    63   32.3    332  10  S33035   hypothetical protein  2.78e+01 
 37    63   32.3    452   1  WHHUF    phenylalanine 4-mono  2.78e+01 
 38    63   32.3   2314   9  A46151   protein-tyrosine-pho  2.78e+01 
 39    62   31.8     76   6  C47342   let 3'-region hypoth  3.72e+01 
 40    61   31.3    764   8  A45321   protein-glutamine ga  4.97e+01 
 41    60   30.8    453   4  A42271   tryptophan 5-monooxy  6.61e+01 
 42    60   30.8    564   7  S30779   HCM1 protein - yeast  6.61e+01 
 43    60   30.8   2331  10  S32776   structural protein -  6.61e+01 
 44    60   30.8    411   5  A34526   ORF1 protein - Orgyi  6.61e+01 


' c 


r ' 


7 " a> 




A 


1 Oft "ft A 







ALIGNMENTS 



RESULT 1 
ENTRY           A53283 #type fragment 
TITLE           major cat allergen Fel d I alpha chain - cat (fragment) 
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat 
DATE            12-May-1994 #sequence_revision 12-May-1994 #text_change 
                12-May-1994 
ACCESSIONS      A53283 
REFERENCE       A53283 
   #authors     Duffort, O.A.; Carreira, J.; Nitti, G.; Polo, F.; Lombardero, M. 
   #journal     Mol. Immunol. (1991) 28:301-309 
   #title       Studies on the biochemical structure of the major cat 
                allergen Felis domesticus I. 
   #accession   A53283 
      ##status     preliminary 
      ##molecule_type protein 
      ##residues   1-40 ##label DUF 
SUMMARY         #length 40 #checksum 3032 

DB 9; Score 195; Match 100.0%; Predicted No. 9.13e-22; 
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Db 7 krdvdlfltgtpdeyveqvaqykalpv 33 
     ||||||||||||||||||||||||||| 
Qy 1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 2 
ENTRY           JC1126 #type complete 
TITLE           major allergen chain 1 precursor B - cat 
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat 
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 
                31-Dec-1993 
ACCESSIONS      JC1126 
REFERENCE       JC1126 
   #authors     Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.; 
                Morgenstern, J.P.; Rogers, B.L. 
   #journal     Gene (1992) 113:263-268 
   #title       Expression and genomic structure of the genes encoding FdI, 
                the major allergen from the domestic cat. 
   #accession   JC1126 
      ##molecule_type DNA 
      ##residues   1-88 ##label GRI 
GENETICS 
   #gene        Ch1 
   #introns     17/1; 79/3 
FEATURE 
   1-18         #domain signal sequence #status predicted #label SIGN 
   19-88        #product major allergen chain 1 #status predicted #label 
                MAT 
SUMMARY         #length 88 #molecular-weight 9586 #checksum 4095 

DB 9; Score 189; Match 96.3%; Predicted No. 1.27e-20; 
Matches 26; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 

Db 25 krdvdlfltgtpdeyveqvaqynalpv 51 
Qy  1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 
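
The Match percentage on each alignment's DB/Score line appears to be the number of identical residues divided by the number of aligned columns (matches + conservative + mismatches + indels). The sketch below reproduces several values from this report under that assumption. The relationship is inferred from the printed counts, not from MPsrch documentation, and the function name is hypothetical.

```python
def alignment_match_percent(matches: int, conservative: int,
                            mismatches: int, indels: int) -> float:
    """Percentage of aligned columns that are exact identities.

    Assumption: the alignment length is the sum of the four printed counts;
    this is inferred from the report's own numbers.
    """
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * matches / columns, 1)


# (matches, conservative, mismatches, indels) as printed for four results.
examples = {
    "major allergen chain 1 precursor": (26, 1, 0, 0),
    "heat shock protein 83": (10, 7, 5, 2),
    "glutamate synthase large chain": (7, 5, 1, 0),
    "DNA polymerase III alpha chain": (7, 8, 4, 1),
}
for name, counts in examples.items():
    print(name, alignment_match_percent(*counts))
```

This differs from the Query Match value in the summary table, which depends on the score rather than on the identity count, so the two percentages for one result need not agree.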



RESULT 3 
TITLE           major allergen chain 1 precursor A - cat 
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat 
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 
                31-Dec-1993 
ACCESSIONS      JC1136 
REFERENCE       JC1126 
   #authors     Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.; 
                Morgenstern, J.P.; Rogers, B.L. 
   #journal     Gene (1992) 113:263-268 
   #title       Expression and genomic structure of the genes encoding FdI, 
                the major allergen from the domestic cat. 
   #accession   JC1136 
      ##molecule_type DNA 
      ##residues   1-92 ##label GRI 
GENETICS 
   #gene        Ch1 
   #introns     21/1; 83/3 
FEATURE 
   1-22         #domain signal sequence #status predicted #label SIGN 
   23-92        #product major allergen chain 1 #status predicted #label 
                MAT 
SUMMARY         #length 92 #molecular-weight 10072 #checksum 4988 

DB 9; Score 189; Match 96.3%; Predicted No. 1.27e-20; 
Matches 26; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 

Db 29 krdvdlfltgtpdeyveqvaqynalpv 55 
Qy  1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 4 

ENTRY           A29617 #type complete 
TITLE           glutamate synthase (NADPH) (EC 1.4.1.13) large chain - 
                Escherichia coli 
ORGANISM        #formal_name Escherichia coli 
DATE            05-Jun-1988 #sequence_revision 05-Jun-1988 #text_change 
                31-Dec-1993 
ACCESSIONS      A29617 
REFERENCE       A91585 
   #authors     Oliver, G.; Gosset, G.; Sanchez-Pescador, R.; Lozoya, E.; Ku, 
                L.M.; Flores, N.; Becerril, B.; Valle, F.; Bolivar, F. 
   #journal     Gene (1987) 60:1-11 
   #title       Determination of the nucleotide sequence for the glutamate 
                synthase structural genes of Escherichia coli K-12. 
   #cross-references MUID:88152492 
   #contents    K12 
   #accession   A29617 
      ##molecule_type DNA 
      ##residues   1-1514 ##label OLI 
      ##note       sequence not compared to nucleotide translation 
GENETICS 
   #gene        gltB 
   #map_position 69 min 
KEYWORDS        flavoprotein; glutamate biosynthesis; iron-sulfur protein; 
                NADP; oxidoreductase 
SUMMARY         #length 1514 #molecular-weight 166224 #checksum 756 

DB 6; Score 74; Match 53.8%; Predicted No. 9.16e-01; 
Matches 7; Conservative 5; Mismatches 1; Indels 0; Gaps 0; 

Db 1314 velyltgdandyv 1326 

I MM I I ::: M 
Qy 4 VDLFLTGTPDEYV 16 



RESULT 5 
ENTRY           A44983 #type complete 
TITLE           heat shock protein 83 - Trypanosoma brucei 
ORGANISM        #formal_name Trypanosoma brucei 
DATE            14-May-1993 #sequence_revision 14-May-1993 #text_change 
                30-Sep-1993 
ACCESSIONS      A44983 
REFERENCE       A44983 
   #authors     Mottram, J.C.; Murphy, W.J.; Agabian, N. 
   #journal     Mol. Biochem. Parasitol. (1989) 37:115-128 
   #title       A transcriptional analysis of the Trypanosoma brucei hsp83 
                gene cluster. 
   #accession   A44983 
      ##status     preliminary 
      ##molecule_type DNA 
      ##residues   1-703 ##label MOT 
      ##cross-references GB:X14176 
CLASSIFICATION  #superfamily heat shock protein 90 
SUMMARY         #length 703 #molecular-weight 80729 #checksum 8300 

DB 5; Score 71; Match 41.7%; Predicted No. 2.40e+00; 
Matches 10; Conservative 7; Mismatches 5; Indels 2; Gaps 2; 

Db 488 rrgmevlfmtdpideyvmqqvkef 511 

:| IIM illl Ml ;: 
Qy 1 KRDVD-LFLTGTPDEYV-EQVAQY 22 



RESULT 6 
ENTRY           S39326 #type complete 
TITLE           auxin-induced mRNA - Arabidopsis thaliana 
ORGANISM        #formal_name Arabidopsis thaliana #common_name mouse-ear 
                cress 
DATE            19-May-1994; #sequence_revision 19-May-1994; #text_change 
                19-May-1994 
ACCESSIONS      S39326 
REFERENCE       S39321 
   #authors     Krivitzky, M.; Bonnet, R.; Jean-Jacques, I.; Kreis, M.; 
                Lecharny, A. 
   #submission  submitted to the EMBL Data Library, December 1993 
   #accession   S39326 
      ##status     preliminary 
      ##residues   1-85 ##label KRI 
      ##cross-references EMBL:Z29042 
SUMMARY         #length 85 #molecular-weight 9794 #checksum 7478 

DB 11; Score 70; Match 40.9%; Predicted No. 3.29e+00; 
Matches 9; Conservative 6; Mismatches 7; Indels 0; Gaps 0; 

Db 22 emalklkgipyeyveeilenks 43 

s: I I I I llll" : l ; 
Qy 3 DVDLFLTGTPDEYVEQVAQYKA 24 



RESULT 7 
ENTRY           S08119 #type complete 
TITLE           heat shock protein 83 - Trypanosoma brucei brucei 
ORGANISM        #formal_name Trypanosoma brucei brucei 
DATE            07-Sep-1990 #sequence_revision 07-Sep-1990 #text_change 
                18-Jun-1993 
ACCESSIONS      S08119 
REFERENCE       S08119 
   #authors     Mottram, J.; Murphy, W.; Agabian, N. 
   #submission  submitted to the EMBL Data Library, January 1989 
   #accession   S08119 
Win' on 1 1 a inr a _ P"M 
      ##residues   1-703 ##label MOT 
      ##cross-references EMBL:X14176 
GENETICS 
   #gene        hsp83 
CLASSIFICATION  #superfamily heat shock protein 90 
SUMMARY         #length 703 #molecular-weight 80715 #checksum 8246 

DB 5; Score 69; Match 41.7%; Predicted No. 4.50e+00; 
Matches 10; Conservative 7; Mismatches 5; Indels 2; Gaps 2; 

Db 488 rrgmevlfmtdpideyvmqqvkdf 511 

' I MM INI Ml 
Qy 1 KRDVD-LFLTGTPDEYV-EQVAQY 22 



RESULT 8 
ENTRY           S33144 #type complete 
TITLE           anthocyanidin hydroxylase - apple tree 
ORGANISM        #formal_name Malus sp. #common_name apple tree 
DATE            22-Nov-1993; #sequence_revision 22-Nov-1993; #text_change 
                22-Nov-1993 
ACCESSIONS      S33144 
REFERENCE       S33144 
   #authors     Davies, K.M. 
   #submission  submitted to the EMBL Data Library, March 1993 
   #accession   S33144 
      ##status     preliminary 
      ##residues   1-357 ##label DAV 
      ##cross-references EMBL:X71360 
SUMMARY         #length 357 #molecular-weight 40332 #checksum 9659 

DB 11; Score 67; Match 40.9%; Predicted No. 8.35e+00; 
Matches 9; Conservative 5; Mismatches 7; Indels 1; Gaps 1; 

Db 156 krdlsiw-pqtpadyieataey 176 

IIP : I Ml'- I Ml 
Qy 1 KRDVDLFLTGTPDEYVEQVAQY 22 



RESULT 9 
ENTRY           A44888 #type fragment 
TITLE           heat shock protein 90 - Leishmania donovani (fragment) 
ALTERNATE_NAMES heat shock protein, 90K 
ORGANISM        #formal_name Leishmania donovani 
DATE            17-Feb-1994 #sequence_revision 17-Feb-1994 #text_change 
                17-Feb-1994 
ACCESSIONS      A44888 
REFERENCE       A44888 
   #authors     de Andrade, C.R.; Kirchhoff, L.V.; Donelson, J.E.; Otsu, K. 
   #journal     J. Clin. Microbiol. (1992) 30:330-335 
   #title       Recombinant Leishmania Hsp90 and Hsp70 are recognized by sera 
                from visceral leishmaniasis patients but not Chagas' 
                disease patients. 
   #cross-references MUID:92165942 
   #contents    strain Sudan SI 
   #accession   A44888 
      ##status     preliminary 
      ##molecule_type mRNA 
      ##residues   1-452 ##label DEI 
      ##cross-references NCBIN:83989; NCBIP:83991 
      ##note       sequence extracted from NCBI backbone 
CLASSIFICATION  #superfamily heat shock protein 90 
KEYWORDS        heat shock protein; phosphoprotein 
SUMMARY         #length 452 #checksum 8278 

Matches 10; Conservative 7; Mismatches 5; Indels 2; Gaps 2; 

Db 237 rrglevlfmtepideyvmqqvkdf 260 

M J! IP! Mil Ml ss 
Qy 1 KRDVD-LFLTGTPDEYV-EQVAQY 22 



RESULT 10 
ENTRY           A45915 #type complete 
TITLE           DNA-directed DNA polymerase (EC 2.7.7.7) III alpha chain - 
                Salmonella typhimurium 
ORGANISM        #formal_name Salmonella typhimurium 
DATE            10-Mar-1994 #sequence_revision 10-Mar-1994 #text_change 
                iS-Nov-1994 
ACCESSIONS      A45915 
REFERENCE       A45915 
   #authors     Lancy, E.D.; Lifsics, M.R.; Munson, P.; Maurer, R. 
   #journal     J. Bacteriol. (1989) 171:5581-5586 
   #title       Nucleotide sequence of dnaE, the gene for the polymerase 
                subunit of DNA polymerase III in Salmonella typhimurium, 
                and a variant that facilitates growth in the absence of 
                another polymerase subunit. 
   #accession   A45915 
      ##status     preliminary 
      ##molecule_type DNA 
      ##residues   1-1160 ##label LAN 
      ##cross-references GB:M29701 
      ##note       translation of nucleotide sequence not given 
KEYWORDS        glycosidase; hydrolase; nucleotidyltransferase 
SUMMARY         #length 1160 #molecular-weight 130118 #checksum 9028 

DB 6; Score 66; Match 35.0%; Predicted No. 1.13e+01; 
Matches 7; Conservative 8; Mismatches 4; Indels 1; Gaps 1; 

Db 957 lglyltghpinqylkeiery 976 

; PHI | ::|: : I 
Qy 4 VDLFLTGTP-DEYVEQVAQY 22 



RESULT 11 
ENTRY           A38234 #type complete 
TITLE           oxoglutarate dehydrogenase (lipoamide) (EC 1.2.4.2) precursor 
                - human 
ALTERNATE_NAMES 2-oxoglutarate: lipoamide 2-oxidoreductase; 
                alpha-ketoglutarate dehydrogenase 
ORGANISM        #formal_name Homo sapiens #common_name man 
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 
                31-Dec-1993 
ACCESSIONS      A38234 
REFERENCE       A38234 
   #authors     Koike, K.; Urata, Y.; Goto, S. 
   #journal     Proc. Natl. Acad. Sci. U.S.A. (1992) 89:1963-1967 
   #title       Cloning and nucleotide sequence of the cDNA encoding human 
                2-oxoglutarate dehydrogenase (lipoamide). 
   #cross-references MUID:92179301 
   #contents    fetal liver 
   #accession   A38234 
      ##molecule_type mRNA 
      ##residues   1-1003 ##label KOI 
      ##cross-references GB:D10523; NCBIP:87352 
      ##note       sequence extracted from NCBI backbone 
KEYWORDS        mitochondrion; oxidoreductase; thiamine pyrophosphate; 
                tricarboxylic acid cycle 
FEATURE 
   1-40         #domain transit peptide (mitochondrion) #status 
r ^ ->rf * >- * ~ - 1 ^ 1 -(-^1 ThlD\ i^^m— 
   41-1003      #product oxoglutarate dehydrogenase (lipoamide) #status 
                predicted #label MAT 
SUMMARY         #length 1003 #molecular-weight 113239 #checksum 617 

DB 8; Score 66; Match 46.7%; Predicted No. 1.13e+01; 
Matches 7; Conservative 6; Mismatches 2; Indels 0; Gaps 0; 

Db 47 epflsgtssnyveem 61 

' 1 1 5 1 I : 'III" 
Qy 5 DLFLTGTPDEYVEQV 19 



RESULT 12 
ENTRY           S23484 #type complete 
TITLE           xylZ protein - Pseudomonas putida plasmid pWW0 
ORGANISM        #formal_name Pseudomonas putida 
DATE            22-Nov-1993; #sequence_revision 22-Nov-1993; #text_change 
                22-Nov-1993 
ACCESSIONS      S23484 
REFERENCE       S23477 
   #authors     Neidle, E.L.; Hartnett, C.; Ornston, L.N.; Bairoch, A.; 
                Rekik, M.; Harayama, S. 
   #journal     Eur. J. Biochem. (1992) 204:113-120 
   #title       Cis-diol dehydrogenases encoded by the TOL pWW0 plasmid xylL 
                gene and the Acinetobacter calcoaceticus chromosomal benD 
                gene are members of the short-chain alcohol dehydrogenase 
                superfamily. 
   #cross-references MUID:92155191 
   #accession   S23484 
      ##status     preliminary 
      ##residues   1-336 ##label NEI 
      ##cross-references EMBL:M64747 
SUMMARY         #length 336 #molecular-weight 36220 #checksum 5068 

DB 10; Score 66; Match 50.0%; Predicted No. 1.13e+01; 
Matches 10; Conservative 4; Mismatches 5; Indels 1; Gaps 1; 

Db 298 evdiylcgpppm-veavsqy 316 

'I I" I II IM' II 
Qy 3 DVDLFLTGTPDEYVEQVAQY 22 



RESULT 13 
ENTRY           DJEC3A #type complete 
TITLE           DNA-directed DNA polymerase (EC 2.7.7.7) III alpha chain - 
                Escherichia coli 
ORGANISM        #formal_name Escherichia coli 
DATE            31-Dec-1988 #sequence_revision 31-Dec-1988 #text_change 
                12-May-1994 
ACCESSIONS      C28390; A37441 
REFERENCE       A91855 
   #authors     Tomasiewicz, H.G.; McHenry, C.S. 
   #journal     J. Bacteriol. (1987) 169:5735-5744 
   #title       Sequence analysis of the Escherichia coli dnaE gene. 
   #cross-references MUID:88058791 
   #accession   C28390 
      ##molecule_type DNA 
      ##residues   1-1160 ##label TOM 
      ##note       the nucleotide sequence has been corrected in reference 
                   A37441 
REFERENCE       A37441 
   #authors     Tomasiewicz, H.G.; McHenry, C.S. 
   #journal     J. Bacteriol. (1991) 173:4549 
   #contents    erratum 
   #accession   A37441 
      ##residues   156-183 ##label TO2 
      ##cross-references GB:M19334 
COMMENT         This protein is the catalytic component of the DNA polymerase III 
                core (composed of alpha, epsilon, and theta chains) that 
                can repair short gaps created by nuclease in duplex DNA. For 
                efficient replication of the long, single-stranded templates, pol 
                III requires the auxiliary chains beta, gamma, and delta. 
GENETICS 
   #gene        dnaE 
   #map_position 4 min 
CLASSIFICATION  #superfamily DNA-directed DNA polymerase III alpha chain 
KEYWORDS        DNA replication; nucleotidyltransferase 
SUMMARY         #length 1160 #molecular-weight 129903 #checksum 8714 

DB 1; Score 66; Match 35.0%; Predicted No. 1.13e+01; 
Matches 7; Conservative 8; Mismatches 4; Indels 1; Gaps 1; 

Db 957 lglyltghpinqylkeiery 976 

i pill MM 5 11 ; i 
Qy 4 VDLFLTGTP-DEYVEQVAQY 22 



RESULT 14 
ENTRY           C41659 #type complete 
TITLE           benzoate 1,2-dioxygenase (EC 1.14.12.10) XylZ protein - 
                Pseudomonas putida plasmid pWW0 
ORGANISM        #formal_name Pseudomonas putida 
DATE            30-Jun-1992 #sequence_revision 30-Jun-1992 #text_change 
                30-Sep-1993 
ACCESSIONS      C41659 
REFERENCE       A41659 
   #authors     Harayama, S.; Rekik, M.; Bairoch, A.; Neidle, E.L.; Ornston, 
                L.N. 
   #journal     J. Bacteriol. (1991) 173:7540-7548 
   #title       Potential DNA slippage structures acquired during 
                evolutionary divergence of Acinetobacter calcoaceticus 
                chromosomal benABC and Pseudomonas putida TOL pWW0 plasmid 
                xylXYZ, genes encoding benzoate dioxygenases. 
   #cross-references MUID:92041666 
   #accession   C41659 
      ##status     preliminary 
      ##molecule_type DNA 
      ##residues   1-336 ##label HAR 
      ##cross-references GB:M64747 
GENETICS 
   #genome      plasmid 
SUMMARY         #length 336 #molecular-weight 36220 #checksum 5068 

DB 6; Score 66; Match 50.0%; Predicted No. 1.13e+01; 
Matches 10; Conservative 4; Mismatches 5; Indels 1; Gaps 1; 

Db 298 evdiylcgpppm-veavsqy 316 

I I II I 5 "' 
Qy 3 DVDLFLTGTPDEYVEQVAQY 22 



RESULT 15 
ENTRY           S39510 #type complete 
TITLE           Glutamate Synthase (EC 1.4.7.1) - Antithamnion sp. 
ORGANISM        #formal_name Antithamnion sp. 
DATE            19-May-1994; #sequence_revision 19-May-1994; #text_change 
                19-May-1994 
ACCESSIONS      S39510 
REFERENCE       S39510 
   #authors     Valentin, K.; Kostrzewa, M.; Zetsche, K. 



I , a- 



07 l TT-'S'C; 



   #title       Glutamate synthase is plastid-encoded in a red alga: 
                implications for the evolution of glutamate synthases. 
   #accession   S39510 
      ##status     preliminary 
      ##residues   1-1536 ##label VAL 
      ##cross-references EMBL:Z21705 
SUMMARY         #length 1536 #molecular-weight 171111 #checksum 668? 

DB 11; Score 65; Match 33.3%; Predicted No. 1.53e+01; 
Matches 5; Conservative 6; Mismatches 4; Indels 0; Gaps 0; 

Db 1340 kgihlylkgeandyv 1354 

: MM I —II 
Qy 2 RDVDLFLTGTPDEYV 16 

Search completed: Fri Mar 24 07:41:57 1995 
Job time : 22 secs. 




Release 2.0 John F. Collins & S. S. Sturrock, Biocomputing Research Unit. 
Copyright (c) 1993, 1994 by University of Edinburgh, U.K. 
Distribution rights by IntelliGenetics, Inc. 



MPsrch_pp protein - protein database search, using Smith-Waterman algorithm 
Run on:         Fri Mar 24 07:41:03 1995; MasPar time 3.49 Seconds 
                109.451 Million cell updates/sec 
Tabular output not generated. 

Title:          >US-08-300-510-1 
Description:    (1:27) from US08300510.pep 
Perfect Score:  195 
Sequence:       1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 

Scoring table:  PAM 150 
                Gap 14 
Searched:       40292 seqs, 14147368 residues 

Database:       swiss-prot30 
                1 SPT1 
                2 SPT2 
                3 SPT3 
                4 SPT4 
                5 SPT5 
                6 SPT6 
                7 SPT7 

Statistics:     Mean 31.852; Variance 49.755; scale 0.640 

Predicted No. is the number of results expected by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
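
The throughput figure in this header can be reproduced, to within rounding of the reported MasPar time, by counting one cell update for every query-residue by database-residue comparison in the Smith-Waterman matrix. That definition of a cell update is an assumption checked only against the numbers printed here, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Estimate dynamic-programming throughput in millions of cells per second.

    Assumption: every query residue is compared with every database residue
    exactly once, so the matrix holds query_len * db_residues cells.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return query_len * db_residues / seconds / 1e6


# Values from this report: 27-residue query, 14147368 database residues,
# 3.49 seconds of MasPar time.
rate = million_cell_updates_per_sec(27, 14_147_368, 3.49)
print(f"{rate:.2f} million cell updates/sec")
```

The result lands very close to the 109.451 printed in the header; the small difference is consistent with the run time being shown to only two decimal places.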



SUMMARIES 

Result Query 
No. Score Match Length DB ID Description Pred. No. 

MAJOR ALLERGEN I POLY 3.32e-27 
MAJOR ALLERGEN I POLY 3.32e-27 
GLUTAMATE SYNTHASE (N 1.4U-01 
HEAT SHOCK PROTEIN 83 9.34e-01 
HEAT SHOCK PROTEIN 90 1.94e+00 
TOLUATE 1,2-DIOXYGENA 2.78e+00 
DNA POLYMERASE III, A 2.78e+00 
2-OXOGLUTARATE DEHYDR 2.78e+00 
DNA POLYMERASE III, A 2.78e+00 
AGRIN PRECURSOR. 3.97e+00 
ANNEXIN I (LIPOCORTIN 3.97e+00 
RYANODINE RECEPTOR, C 3.97e+00 
FERREDOXIN-DEPENDENT 3.97e+00 
HYPOTHETICAL GENE 67 5.64e+00 
SUCRASE-ISOMALTASE, I 5.64e+00 
HYPOTHETICAL PROTEIN 7.98e+00 
PROTEIN-TYROSINE PHOS 7.98e+00 
HEAT SHOCK LIKE 85 KD 7.98e+00 
PHENYLALANINE-4-HYDRO 7.98e+00 
EXCINUCLEASE ABC SUBU 7.98e+00 
HYPOTHETICAL 24.6 KD 7.98e+00 
PROTEIN-TYROSINE PHOS 7.98e+00 
PHENYLALANINE-4-HYDRO 7.98e+00 
PROTEIN-TYROSINE PHOS 7.98e+00 
PHENYLALANINE-4-HYDRO 7.98e+00 
HYPOTHETICAL 21.1 KD 1.12e+01 
LACTICIN 481/LACTOCOC 1.12e+01 
HEAT SHOCK PROTEIN 90 1.58e+01 
MYOSIN HEAVY CHAIN IB 2.20e+01 
PHENYLALANINE-4-HYDRO 2.20e+01 
P48 PROTEIN. 2.20e+01 
RNA-DIRECTED RNA POLY 2.20e+01 
HCM1 PROTEIN. 2.20e+01 
D-ALANYL-D-ALANINE CA 2.20e+01 
MYOSIN II HEAVY CHAIN 3.06e+01 
HEAT SHOCK PROTEIN 83 3.06e+01 
ALANINE RACEMASE (EC 3.06e+01 
MITOCHONDRIAL RIBONUC 3.06e+01 
MYELIN PO PROTEIN PRE 4.23e+01 
HYPOTHETICAL 78.8 KD 4.23e+01 
HYPOTHETICAL PXBL-I P 4.23e+01 
TUBULIN BETA-2 CHAIN. 4.23e+01 
ELONGATION FACTOR G ( 4.23e+01 
COLICIN E1 PROTEIN. 4.23e+01 
CFA/I FIMBRIAL SUBUNI 4.23e+01 



ALIGNMENTS 



RESULT 1 
ID FELB_FELCA STANDARD; PRT; 88 AA. 
AC P30439; 
DT 01-APR-1993 (REL. 25, CREATED) 
DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE) 
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE) 
DE MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PRECURSOR (FEL D I) 
DE (CAT-1) (AG 4). 
GN CH1. 
OC eukaryota; metazoa; chordata; vertebrata; tetrapoda; mammalia; 
OC eutheria; carnivora. 
RP SEQUENCE FROM N.A., AND SEQUENCE OF 19-88. 
RM 92052157 
RA MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.W., ROGERS B.L., 
RA BOND J.F., CHAPMAN M.D., KUO M.-C.; 
RL PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991). 
RN [2] 
RP SEQUENCE FROM N.A. 
RA GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P., 
RA ROGERS B.L.; 
RL GENE 113:263-268(1992). 
RN [3] 
RP SEQUENCE OF 19-58, AND CHARACTERIZATION. 
RM 91287714 
RA DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.; 
RL MOL. IMMUNOL. 28:301-309(1991). 
RN [4] 
RP CHARACTERIZATION. 
RA LEITERMANN K., OHMAN J.L. JR.; 
RL J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991). 
CC -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT. 
CC -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED 
CC DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2. 
CC -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS. 
CC -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE 
CC RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE 
CC OF THIS ALLERGEN SUBUNIT. 
CC -!- SIMILARITY: TO UTEROGLOBIN. 
DR EMBL; M74953; FDFELDIB. 
DR PIR; JC1126; JC1126. 
DR PROSITE; PS00403; UTEROGLOBIN_1. 
DR PROSITE; PS00404; UTEROGLOBIN_2. 
KW ALLERGEN; SIGNAL; ALTERNATIVE SPLICING. 
FT CHAIN 19 88 MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1. 
FT DISULFID 21 21 INTERCHAIN (POTENTIAL). 
FT DISULFID 88 88 INTERCHAIN (POTENTIAL). 
FT VARIANT 47 47 K -> N. 
FT CONFLICT 78 78 L -> V (IN REF. 2). 
SQ SEQUENCE 88 AA; 9614 MW; 39445 CN; 

DB 2; Score 195; Match 100.0%; Predicted No. 3.32e-27; 
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 

Db 25 krdvdlfltgtpdeyveqvaqykalpv 51 
      ||||||||||||||||||||||||||| 
Qy  1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 2 

ID FELA.FELCA STANDARD; PRT; 92 AA. 

AC P30438; 

DT 01-APR-1993 (REL. 25, CREATED) 

DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE) 

DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE) 

DE MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PRECURSOR (FEL D I) 

DE (CAT-1) (AG 4). 

GN CHI. 

OS FEL IS CATUS (CAT) . 

oc eukaryota; metazoa; chordata; vertebrata; tetrapoda; mammalia; 

oc eutheria; carnivora. 

rn m 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 23-92. 

RC TISSUE=SALIVARY GLAND; 

RM 92052157 

-,. Mnprtrki" Trr > kl 1 D rctrciTU i ■ ppahcp a u . pn^cpc R I ..... 



RA BOND J.F., CHAPMAN M.D., KUO M.-C.J 

RL PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991) . 

RN C2J 

RP SEQUENCE FROM N.A. 

Z GRIFFITH I.J., CRAIG 8.. POLLOCK J., YU X.-B., HORGENSTERH J.P., 

RA ROGERS B.L., 

RL GENE 113:263-268(1992) . 

RN [31 

RP SEQUENCE OF 23-62, AND CHARACTERIZATION. 

RM 9 1 2877 i 4 

RA DUFFORT O.A., CARREIRA J.- NITTI G., POLO F., LQMBARDERO H.J 

RL MOL. IMMUNOL. 28:301-309(1991). 

RN C43 

RP CHARACTERIZATION. 

RA LEI TERMANN K. . OHMAN J.L. JR.f 

RL J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991). 

CC -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT. 

CC -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED 

CC DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2. 

CC -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS. 

CC -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE 

CC RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE 

CC OF THIS ALLERGEN SUBUNIT. 

CC -!- SIMILARITY: TO UTEROGLOBIN. 

DR EMBL; M74952; FDFELDI. 

DR PIR; JC1136; JC1136. 

DR PROSITE; PS00403; UTEROGLOBIN_1. 

DR PROSITE; PS00404; UTEROGLOBIN_2. 

KW ALLERGEN; SIGNAL; ALTERNATIVE SPLICING. 

FT CHAIN 1 23 92 MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1. 

FT DISULFID 25 25 INTERCHAIN (POTENTIAL). 

FT DISULFID 92 92 INTERCHAIN (POTENTIAL). 

FT VARIANT 51 51 K -> N. 

FT CONFLICT 5 5 R -> C (IN REF. 2). 

FT CONFLICT 13 18 W -> S (IN REF. 2). 

FT CONFLICT 82 82 L -> V (IN REF. 2). 

SQ SEQUENCE 92 AA; 10252 MW; 43206 CN; 

DB 2; Score 195; Match 100.0%; Predicted No. 3.32e-27; 

Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Db 29 krdvdlfltgtpdeyveqvaqykalpv 55 

      ||||||||||||||||||||||||||| 

Qy 1 KRDVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 3 

ID GLTB_ECOLI STANDARD; PRT; 1514 AA. 

AC P09831; 

DT 01-MAR-1989 (REL. 10, CREATED) 

DT 01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE) 

DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE) 

DE GLUTAMATE SYNTHASE (NADPH) LARGE CHAIN PRECURSOR (EC 1.4.1.13) 

DE (NADPH-GOGAT). 

GN GLTB. 

OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS; 

OC ENTEROBACTERIACEAE. 

RN [1] 

RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE. 

RC STRAIN=K12; 

RM 88152492 

RA OLIVER G., GOSSET G., SANCHEZ-PESCADOR R., LOZOYA E., KU L.M., 



"TT^ — cr = rawn..»w ~ - - 

RL GENE 60:1-11(1987). 

RN [2] 

RP DISCUSSION OF SEQUENCE. 

RA GOSSET G., MERINO E., RECILLAS F., OLIVER G., BECERRIL B., BOLIVAR F.; 

RL PROTEIN SEQ. DATA ANAL. 2:9-16(1989). 

CC -!- CATALYTIC ACTIVITY: 2 L-GLUTAMATE + NADP(+) = L-GLUTAMINE + 
CC 2-OXOGLUTARATE + NADPH. 

CC -!- PATHWAY: NITROGEN METABOLISM, GLUTAMATE BIOSYNTHESIS. 

CC THE CATALYZED REACTION BRINGS TOGETHER THE NITROGEN AND 

CC CARBON METABOLISM. 

CC -!- COFACTOR: IRON-SULFUR; FAD AND FMN FLAVOPROTEIN. 

CC -!- SUBUNIT: AGGREGATE OF 4 CATALYTICAL ACTIVE HETERODIMERS, 

CC CONSISTING OF A LARGE AND A SMALL SUBUNIT. 

CC -!- GLUTAMINE BINDS TO THE LARGE SUBUNIT AND TRANSFERS THE AMIDO GROUP 
CC TO 2-OXO-GLUTAMATE THAT APPARENTLY BINDS TO THE SMALL SUBUNIT. 

CC -!- SIMILARITY: TO OTHER GLUTAMATE SYNTHASES. 

DR EMBL; M18747; ECGLTB. 

DR PIR; A29617; A29617. 

DR ECOGENE; EG10403; GLTB. 

KW OXIDOREDUCTASE; IRON-SULFUR; FLAVOPROTEIN; FAD; FMN; 

KW GLUTAMATE BIOSYNTHESIS; SIGNAL. 

FT CHAIN 1 43 1514 GLUTAMATE SYNTHASE LARGE CHAIN. 

FT NP_BIND 1077 1134 FMN (BY SIMILARITY). 

SQ SEQUENCE 1514 AA; 166225 MW; 11266241 CN; 

DB 3; Score 74; Match 53.8%; Predicted No. [illegible]; 

Matches 7; Conservative 5; Mismatches 1; Indels 0; Gaps 

Db 1314 velyltgdandyv 1326 

I '1*1 1 1 ;j5 ll 
Qy 4 VDLFLTGTPDEYV 16 

RESULT 4 

ID HS83_TRYBB STANDARD; PRT; 703 AA. 

AC P12861; 

DT 01-OCT-1989 (REL. 12, CREATED) 

DT 01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE) 

DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE) 

DE HEAT SHOCK PROTEIN 83. 

GN HSP83. 

OC EUKARYOTA; PROTOZOA; SARCOMASTIGOPHORA; MASTIGOPHORA; KINETOPLASTIDA; 

OC TRYPANOSOMATIDAE. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=LSTAR SERODEME; 

RM 90136708 

RA MOTTRAM J., MURPHY W., AGABIAN N.; 

RL MOL. BIOCHEM. PARASITOL. 37:115-128(1989). 

CC -!- FUNCTION: MOLECULAR CHAPERONE. HAS ATPASE ACTIVITY. 

CC -!- SUBCELLULAR LOCATION: CYTOPLASMIC. 

CC -!- SIMILARITY: BELONGS TO THE HEAT SHOCK PROTEIN HSP90 FAMILY. 

DR EMBL; X14176; TBHSP83. 

DR PIR; S08119; S08119. 

DR PROSITE; PS00298; HSP90. 

KW CHAPERONE; ATP-BINDING; HEAT SHOCK. 

SQ SEQUENCE 703 AA; 80715 MW; 2466880 CN; 

DB 3; Score 69; Match 41.7%; Predicted No. 9.34e-01; 

Matches 10; Conservative 7; Mismatches 5; Indels 2; Gaps 

Db 488 rrgpievlfntdpideyvnqqvkdf 511 

Qy   1 KRDVD-LFLTGTPDEYV-EQVAQY 22 



RESULT 5 

ID HS90_LEIDO STANDARD; PRT; 452 AA. 

AC P27890; 

DT 01-AUG-1992 (REL. 23, CREATED) 

DT 01-AUG-1992 (REL. 23, LAST SEQUENCE UPDATE) 

DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE) 

DE HEAT SHOCK PROTEIN 90 (HSP 90) (FRAGMENT). 

GN HSP90. 

OC EUKARYOTA; PROTOZOA; SARCOMASTIGOPHORA; MASTIGOPHORA; KINETOPLASTIDA; 

OC TRYPANOSOMATIDAE. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=SUDAN SI; 

RM 92165942 

RA DE ANDRADE C.R., KIRCHHOFF L.V., DONELSON J.E., OTSU K.; 

RL J. CLIN. MICROBIOL. 30:330-335(1992). 

CC -!- FUNCTION: MOLECULAR CHAPERONE. HAS ATPASE ACTIVITY. 

CC -!- SIMILARITY: BELONGS TO THE HEAT SHOCK PROTEIN HSP90 FAMILY. 

DR EMBL; M73492; LDHSP90. 

DR PIR; A44888; A44888. 

DR PROSITE; PS00298; HSP90. 

KW CHAPERONE; ATP-BINDING; HEAT SHOCK. 

FT NON_TER 1 1 

SQ SEQUENCE 452 AA; 52691 MW; 1061521 CN; 

DB 3; Score 67; Match 41.7%; Predicted No. 1.94e+00; 

Matches 10; Conservative 7; Mismatches 5; Indels 2; Gaps 2; 

Db 237 rrglevlfntepideyvnqqvkdf 260 

if" MM MM Ml :i 
Qy 1 KRDVD-LFLTGTPDEYV-EQVAQY 22 



RESULT 6 

ID XYLZ_PSEPU STANDARD; PRT; 336 AA. 

AC P23101; 

DT 01-NOV-1991 (REL. 20, CREATED) 

DT 01-NOV-1991 (REL. 20, LAST SEQUENCE UPDATE) 

DT 01-AUG-1992 (REL. 23, LAST ANNOTATION UPDATE) 

DE TOLUATE 1,2-DIOXYGENASE ELECTRON TRANSFER COMPONENT (CONTAINS: 

DE FERREDOXIN AND FERREDOXIN--NAD(+) REDUCTASE (EC 1.18.1.3)). 

GN XYLZ. 

OS PSEUDOMONAS PUTIDA. 

OG PLASMID TOL PWW0. 

OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; AEROBIC RODS AND COCCI; 

OC PSEUDOMONADACEAE. 

RN [1] 

RP SEQUENCE FROM N.A. 

RM 92041666 

RA HARAYAMA S., REKIK M., BAIROCH A., NEIDLE E.L., ORNSTON L.N., 

RL J. BACTERIOL. 173:7540-7548(1991). 

CC -!- FUNCTION: ELECTRON TRANSFER COMPONENT OF TOLUATE 1,2-DIOXYGENASE 

CC SYSTEM. 

CC -!- SUBUNIT: THE DIOXYGENASE COMPLEX IS COMPOSED OF AN HYDROXYLASE 

CC COMPONENT THAT CONSISTS OF TWO CHAINS (XYLX AND XYLY), AND AN 

CC ELECTRON TRANSFER COMPONENT (XYLZ). 

CC -!- CATALYTIC ACTIVITY: REDUCED FERREDOXIN + NAD(+) = OXIDIZED 

CC -!- SIMILARITY: IN THE N-TERMINAL REGION WITH 2FE-2S FERREDOXINS, AND 

CC IN THE REST OF THE SEQUENCE WITH FERREDOXIN REDUCTASE. 

«~ _l_ CTM1I A"'" ,M '! e-orM,->_ Tn_OCM" "rr Wl fi 



DR EMBL; M64747; PPXYL. 

DR PIR; C41659; C41659. 

DR PIR; S23484; S23484. 

DR PROSITE; PS00197; 2FE2S_FERREDOXIN. 

KW AROMATIC HYDROCARBONS CATABOLISM; FLAVOPROTEIN; OXIDOREDUCTASE; 

KW FAD; NAD; IRON-SULFUR; PLASMID. 

FT DOMAIN 29 98 FERREDOXIN. 

FT DOMAIN 99 336 FERREDOXIN-REDUCTASE. 

FT METAL 40 40 IRON-SULFUR (2FE-2S) (BY SIMILARITY). 

FT METAL 45 45 IRON-SULFUR (2FE-2S) (BY SIMILARITY). 

FT METAL 48 48 IRON-SULFUR (2FE-2S) (BY SIMILARITY). 

FT METAL 81 81 IRON-SULFUR (2FE-2S) (BY SIMILARITY). 

SQ SEQUENCE 336 AA; 36220 MW; 575585 CN; 

DB 7; Score 66; Match 50.0%; Predicted No. 2.73e+00; 

Matches 10; Conservative 4; Mismatches 5; Indels 1; Gaps 

Db 298 evdiylcgpppn-veavsqy 316 

MPM I I M I'll 
Qy 3 DVDLFLTGTPDEYVEQVAQY 22 



RESULT 7 

ID DP3A_ECOLI STANDARD; PRT; 1160 AA. 

AC P10443; 

DT 01-MAR-1989 (REL. 10, CREATED) 

DT 01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE) 

DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE) 

DE DNA POLYMERASE III, ALPHA CHAIN (EC 2.7.7.7). 

GN DNAE OR POLC. 

OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS; 

OC ENTEROBACTERIACEAE. 

RN [1] 

RP SEQUENCE FROM N.A. 

RM 88058791 

RA TOMASIEWICZ H.G., MCHENRY C.S.; 

RL J. BACTERIOL. 169:5735-5744(1987). 

RN [2] 

RP SEQUENCE OF 1070-1160 FROM N.A. 

RM 93123150 

RA LI S.J., CRONAN J.E. JR.; 

RL J. BACTERIOL. 175:332-340(1993). 

RN [3] 

RP REVIEW. 

RM 92246902 

RA O'DONNELL M.; 

RL BIOESSAYS 14:105-111(1992). 

RN [4] 

RP MUTAGENESIS. 

RM 93387658 

RA FIJALKOWSKA I.J., SCHAAPER R.M.; 

RL GENETICS 134:1039-1044(1993). 

CC -!- FUNCTION: DNA POLYMERASE III IS A COMPLEX, MULTICHAIN ENZYME 

CC RESPONSIBLE FOR MOST OF THE REPLICATIVE SYNTHESIS IN BACTERIA. 

CC THIS DNA POLYMERASE ALSO EXHIBITS 3' TO 5' EXONUCLEASE ACTIVITY. 

CC THE ALPHA CHAIN IS THE DNA POLYMERASE. 

CC -!- CATALYTIC ACTIVITY: N DEOXYNUCLEOSIDE TRIPHOSPHATE = 

CC N PYROPHOSPHATE + DNA(N). 

CC -!- SUBUNIT: CONTAINS A CORE (COMPOSED OF ALPHA, EPSILON, AND THETA 

CC CHAINS) THAT ASSOCIATES WITH A TAU SUBUNIT WHICH ALLOW THE CORE 

CC DIMERIZATION TO FORM THE POLIII' COMPLEX. POLIII' ASSOCIATES WITH 

CC THE GAMMA COMPLEX (COMPOSED OF CHAINS GAMMA, DELTA, DELTA', PSI, 

CC AND CHI) AND WITH THE BETA CHAIN. THE FINAL COMPOSITION OF THE 

CC COMPLEX IS: (ALPHA,EPSILON,THETA)[2]-TAU[2]-(GAMMA,DELTA,DELTA', 



DR EMBL; M19334; ECLPXA. 

DR EMBL; S52931; S52931. 

DR PIR; C28390; DJEC3A. 

DR PIR; A40637; A40637. 

DR ECOGENE; EG10238; DNAE. 

KW DNA-DIRECTED DNA POLYMERASE; DNA REPLICATION. 

SQ SEQUENCE 1160 AA; 129904 MW; 6441060 CN; 

DB 2; Score 66; Match 35.0%; Predicted No. 2.73e+00; 

Matches 7; Conservative 8; Mismatches 4; Indels 1; Gaps 1; 



Db 957 Iglyltghpinqylkeiery 976 

: |:||| | :?p :! : i 
Qy 4 VDLFLTGTP-DEYVEQVAQY 22 
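
The Match figure printed on each DB line appears to be the number of identical positions divided by the number of alignment columns, gap columns included (here 7 / 20 = 35.0%), rather than the score divided by the perfect score. The Python sketch below illustrates that reading. The formula is inferred from the figures printed in this report, and `match_percent` is a hypothetical helper, not part of MPSRCH.

```python
def match_percent(db_aligned: str, qy_aligned: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption (inferred from this report, not from MPSRCH documentation):
    the denominator is the total number of alignment columns, gaps included.
    """
    if len(db_aligned) != len(qy_aligned):
        raise ValueError("aligned strings must have the same number of columns")
    if not db_aligned:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for d, q in zip(db_aligned.upper(), qy_aligned.upper())
        if d == q and d != "-"  # a gap facing a gap is never counted as a match
    )
    return 100.0 * identical / len(qy_aligned)


# Result 7 alignment as printed above (Db line, then Qy line).
result_7 = match_percent("Iglyltghpinqylkeiery", "VDLFLTGTP-DEYVEQVAQY")
print(f"{result_7:.1f}%")
```

For the Result 7 alignment this counts 7 identical columns out of 20, in agreement with the Match value on its DB line.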



RESULT 8 

ID ODO1_HUMAN STANDARD; PRT; 1003 AA. 

AC Q02218; 

DT 01-JUL-1993 (REL. 26, CREATED) 

DT 01-JUL-1993 (REL. 26, LAST SEQUENCE UPDATE) 

DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE) 

DE 2-OXOGLUTARATE DEHYDROGENASE E1 COMPONENT PRECURSOR (EC 1.2.4.2) 

DE (ALPHA-KETOGLUTARATE DEHYDROGENASE). 

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA; 

OC EUTHERIA; PRIMATES. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=LIVER; 

RM 92179301 

RA KOIKE K., URATA Y., GOTO S.; 

RL PROC. NATL. ACAD. SCI. U.S.A. 89:1963-1967(1992). 

CC -!- FUNCTION: THE 2-OXOGLUTARATE DEHYDROGENASE COMPLEX CATALYZES THE 

CC OVERALL CONVERSION OF 2-OXOGLUTARATE TO SUCCINYL-COA & CO(2). IT 

CC CONTAINS MULTIPLE COPIES OF 3 ENZYMATIC COMPONENTS: 2-OXOGLUTARATE 

CC DEHYDROGENASE (E1), DIHYDROLIPOAMIDE SUCCINYLTRANSFERASE (E2) AND 

CC LIPOAMIDE DEHYDROGENASE (E3). 

CC -!- CATALYTIC ACTIVITY: 2-OXOGLUTARATE + LIPOAMIDE = S-SUCCINYL- 

CC DIHYDROLIPOAMIDE + CO(2). 

CC -!- COFACTOR: THIAMINE PYROPHOSPHATE. 

CC -!- ENZYME REGULATION: CATABOLITE REPRESSED. 

CC -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX. 

DR EMBL; D10523; HS20GDH. 

DR PIR; A38234; A38234. 

2 E?JS'o1!^!£t£. FLAVOPROTEINf THIAMINE PYROPHOSPHATE; 

KW MITOCHONDRION; TRANSIT PEPTIDE. 

FT TRANSIT 1 40 MITOCHONDRION. 

FT CHAIN 41 1003 ALPHA-KETOGLUTARATE DEHYDROGENASE. 

SQ SEQUENCE 1003 AA; 113239 MW; 5162887 CN; 

DB 5; Score 66; Match 46.7%; Predicted No. 2.78e+00; 

Matches 7; Conservative 6; Mismatches 2; Indels 0; Gaps 0; 

Db 47 epfllgtssnyveen 61 

: | PIP Mil" 

Qy 5 DLFLTGTPDEYVEQV 19 



RESULT 9 

ID DP3A_SALTY STANDARD; PRT; 1160 AA. 

AC P14567; 

DT 01-JAN-1990 (REL. 13, CREATED) 

DT 01-JAN-1990 (REL. 13, LAST SEQUENCE UPDATE) 

"I noi ' DCt_ " 



DE DNA POLYMERASE III, ALPHA CHAIN (EC 2.7.7.7). 

GN DNAE OR POLC. 

OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS; 

OC ENTEROBACTERIACEAE. 

RN [1] 

RP SEQUENCE FROM N.A. 

RM 90008797 

RA LANCY E.D., LIFSICS M.R., MUNSON P., MAURER R.; 
RL J. BACTERIOL. 171:5581-5586(1989). 

CC -!- FUNCTION: DNA POLYMERASE III IS A COMPLEX, MULTICHAIN ENZYME 
CC RESPONSIBLE FOR MOST OF THE REPLICATIVE SYNTHESIS IN BACTERIA. 
CC THIS DNA POLYMERASE ALSO EXHIBITS 3' TO 5' EXONUCLEASE ACTIVITY. 

CC THE ALPHA CHAIN IS THE DNA POLYMERASE. 

CC -!- CATALYTIC ACTIVITY: N DEOXYNUCLEOSIDE TRIPHOSPHATE = 

CC -!- SUBUNIT: CONTAINS A CORE (COMPOSED OF ALPHA, EPSILON, AND THETA 
CC CHAINS) THAT ASSOCIATES WITH A TAU SUBUNIT WHICH ALLOW THE CORE 
CC DIMERIZATION TO FORM THE POLIII' COMPLEX. POLIII' ASSOCIATES WITH 

CC THE GAMMA COMPLEX (COMPOSED OF CHAINS GAMMA, DELTA, DELTA', PSI, 

CC AND CHI) AND WITH THE BETA CHAIN. THE FINAL COMPOSITION OF THE 

CC COMPLEX IS: (ALPHA,EPSILON,THETA)[2]-TAU[2]-(GAMMA,DELTA,DELTA', 

CC PSI,CHI)[2]-BETA[4]. 

DR EMBL; M29701; STDNAE. 
DR EMBL; M26046; STPOL3A. 
DR PIR; A45915; A45915. 

KW DNA-DIRECTED DNA POLYMERASE; DNA REPLICATION. 
SQ SEQUENCE 1160 AA; 130118 MW; 6471246 CN; 

DB 2; Score 66; Match 35.0%; Predicted No. 2.78e+00; 

Matches 7; Conservative 8; Mismatches 4; Indels 1; Gaps 1; 



Db 957 Iglyltghpinqylkeiery 976 

: |:||| | ::p :: 5 I 
Qy 4 VDLFLTGTP-DEYVEQVAQY 22 



RESULT 10 

ID AGRI_CHICK STANDARD; PRT; 1955 AA. 

AC P31696; 

DT 01-JUL-1993 (REL. 26, CREATED) 

DT 01-JUL-1993 (REL. 26, LAST SEQUENCE UPDATE) 

DT 01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE) 

DE AGRIN PRECURSOR. 

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; AVES; NEOGNATHAE; 

OC GALLIFORMES. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RA TSIM K.W.K., RUEGG M.A., ESCHER G., KROEGER S., MCMAHAN U.J.; 

RL NEURON 8:677-689(1992). 

RN [2] 

RP ALTERNATIVE SPLICING. 

RM 92232H98 

RA RUEGG M.A., TSIM K.W.K., HORTON S.E., KROEGER S., ESCHER G., 

RA GENSCH E.M., MCMAHAN U.J.; 

RL NEURON 8:691-699(1992). 

CC -!- FUNCTION: COMPONENT OF THE BASAL LAMINA THAT CAUSES THE 
CC AGGREGATION OF ACETYLCHOLINE RECEPTORS AND ACETYLCHOLINE-ESTERASE 
CC ON THE SURFACE OF MUSCLE FIBERS OF THE NEUROMUSCULAR JUNCTION. 

CC -!- SUBCELLULAR LOCATION: SYNAPTIC BASAL LAMINA AT THE NEUROMUSCULAR 

CC JUNCTION. 

CC -!- ALTERNATIVE PRODUCTS: AT LEAST THREE DIFFERENT FORMS ARISE BY 



CC CLUSTERING ACTIVITY. 
DR EMBL; M94271; GGAGRIN. 
DR EMBL; M97371; GGAGRPR1A. 
DR EMBL; M97372; GGAGRPR2A. 
DR PIR; JH0591; AGCH. 
DR PROSITE; PS00022; EGF. 
KW GLYCOPROTEIN; EGF-LIKE DOMAIN; REPEAT; ALTERNATIVE SPLICING; SIGNAL. 



FT SIGNAL 1 38 POTENTIAL. 
FT CHAIN 39 1955 AGRIN. 
FT DOMAIN 54 126 KAZAL-LIKE. 
FT DOMAIN 130 201 KAZAL-LIKE. 
FT DOMAIN 202 273 KAZAL-LIKE. 
FT DOMAIN 276 344 KAZAL-LIKE. 
FT DOMAIN 350 418 KAZAL-LIKE. 
FT DOMAIN 419 483 KAZAL-LIKE. 
FT DOMAIN 484 548 KAZAL-LIKE. 
FT DOMAIN 551 633 KAZAL-LIKE. 
FT SIMILAR 687 793 LAMININ DOMAI 
FT REPEAT 688 739 
FT REPEAT 742 786 
FT DOMAIN 781 851 KAZAL-LIKE. 
FT DOMAIN 1233 1751 4 X EGF-TYPE 
FT REPEAT 1233 1264 EGF-LIKE 1. 
FT REPEAT 1450 1482 EGF-LIKE 2. 
FT REPEAT 1489 1521 EGF-LIKE 3. 
FT REPEAT 1718 1751 EGF-LIKE 4. 
FT DOMAIN 856 995 SER/THR-RICH 
FT DOMAIN 1150 1219 SER/THR-RICH 
FT VARSPLIC 1648 1651 MISSING (IN 
FT VARSPLIC 1784 1793 MISSING (IN 
FT 2). 
FT DISULFID 86 105 POTENTIAL. 
FT DISULFID 94 126 POTENTIAL. 
FT DISULFID 160 180 POTENTIAL. 
FT DISULFID 169 201 POTENTIAL. 
FT DISULFID 233 252 POTENTIAL. 
FT DISULFID 241 273 POTENTIAL. 
FT DISULFID 304 323 POTENTIAL. 
FT DISULFID 312 344 POTENTIAL. 
FT DISULFID 378 397 POTENTIAL. 
FT DISULFID 386 418 POTENTIAL. 
FT DISULFID 443 462 POTENTIAL. 
FT DISULFID 451 483 POTENTIAL. 
FT DISULFID 507 527 POTENTIAL. 
FT DISULFID 516 548 POTENTIAL. 
FT DISULFID 592 612 POTENTIAL. 
FT DISULFID 601 633 POTENTIAL. 
FT DISULFID 810 830 POTENTIAL. 
FT DISULFID 819 851 POTENTIAL. 
FT DISULFID 1233 1244 POTENTIAL. 
FT DISULFID 1238 1253 POTENTIAL. 
FT DISULFID 1255 1264 POTENTIAL. 
FT DISULFID 1450 1461 POTENTIAL. 
FT DISULFID 1455 1471 POTENTIAL. 
FT DISULFID 1473 1482 POTENTIAL. 
FT DISULFID 1489 1500 POTENTIAL. 
FT DISULFID 1494 1510 POTENTIAL. 
FT DISULFID 1512 1521 POTENTIAL. 
FT CARBOHYD 390 390 POTENTIAL. 
FT CARBOHYD 659 659 POTENTIAL. 
FT CARBOHYD 764 764 POTENTIAL. 
FT CARBOHYD 814 814 POTENTIAL. 


SQ SEQUENCE 1955 AA; 211411 MW; 1742620 CN; 

DB 1; Score 65; Match 40.0%; Predicted No. 3.97e+00; 

Db 1387 dtdlfvggapedqmavvaertaatv 1411 

I III* PI" ' IP I I 
Qy 3 DVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 11 

ID ANX1_MOUSE STANDARD; PRT; 345 AA. 

AC P10107; 

DT 01-MAR-1989 (REL. 10, CREATED) 

DT 01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE) 

DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE) 

DE ANNEXIN I (LIPOCORTIN I) (CALPACTIN II) (CHROMOBINDIN 9) (P35) 

DE (PHOSPHOLIPASE A2 INHIBITORY PROTEIN). 

GN LPC-1. 

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA; 

OC EUTHERIA; RODENTIA. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=DS; 

RM 89098333 

RA SAKATA T., IWAGAMI S., TSURUTA Y., SUZUKI R., HOJO K., SATO K., 

RA TERAOKA H.; 

RL NUCLEIC ACIDS RES. 16:11818-11818(1988). 

RN [2] 

RP SEQUENCE OF 5-345 FROM N.A. 

RA PHILIPPS C., ROSE-JOHN S., RINCKE C., FUERSTENBERGER G., MARKS F.; 

RL BIOCHEM. BIOPHYS. RES. COMMUN. 159:155-162(1989). 
CC -!- FUNCTION: THIS PROTEIN REGULATES PHOSPHOLIPASE A2 ACTIVITY. IT 

CC SEEMS TO BIND FROM TWO TO FOUR CALCIUM IONS WITH HIGH AFFINITY. 
CC -!- PTM: PHOSPHORYLATION OF ANNEXIN I RESULTS IN LOSS OF ITS 

CC INHIBITORY ACTIVITY. 

CC -!- DOMAIN: CONTAINS FOUR HOMOLOGOUS REPEATS WITH A CONSENSUS 

CC SEQUENCE COMMON TO ALL ANNEXIN PROTEINS. A PAIR OF THESE REPEATS 

CC MAY FORM ONE BINDING SITE FOR CALCIUM AND PHOSPHOLIPID. 

CC -!- SIMILARITY: TO OTHER PROTEINS OF THE ANNEXIN FAMILY. 

DR EMBL; X07486; MMLCIR. 

DR EMBL; M24554; MMLCI. 

DR PIR; S02181; LUMS1. 

DR PROSITE; PS00223; ANNEXIN. 

KW ANNEXIN; CALCIUM/PHOSPHOLIPID-BINDING; REPEAT; 
KW PHOSPHOLIPASE A2 INHIBITOR; PHOSPHORYLATION. 



FT INIT_MET 0 0 
FT REPEAT 50 110 ANNEXIN. 
FT REPEAT 122 182 ANNEXIN. 
FT REPEAT 206 266 ANNEXIN. 
FT REPEAT 281 341 ANNEXIN. 
FT MOD_RES 20 20 PHOSPHORYLATION (BY TYR-KINASES). 
FT CONFLICT 77 78 QS -> PR (IN REF. 2). 
FT CONFLICT 221 221 T -> H (IN REF. 2). 
FT CONFLICT 273 273 T -> H (IN REF. 2). 
SQ SEQUENCE 345 AA; 38603 MW; 569605 CN; 

DB 1; Score 65; Match 45.0%; Predicted No. 3.97e+00; 
Matches 9; Conservative 3; Mismatches 8; Indels 0; Gaps 



Db 12 flenqeqeyvqavksykggp 31 

II 'IIP ! IP I 
Qy 7 FLTGTPDEYVEQVAQYKALP 26 



RESULT 12 

ID RYNC_RABIT STANDARD; PRT; 4969 AA. 
DT 01-JUL-1993 (REL. 26, CREATED) 
DT 01-JUL-1993 (REL. 26, LAST SEQUENCE UPDATE) 
DT 01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE) 
DE RYANODINE RECEPTOR, CARDIAC MUSCLE. 
OS ORYCTOLAGUS CUNICULUS (RABBIT). 
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA; 
OC EUTHERIA; LAGOMORPHA. 
RN [1] 
RP SEQUENCE FROM N.A. 
RC TISSUE=CARDIAC MUSCLE; 
RM 90337947 
RA OTSU K., WILLARD H.F., KHANNA V.K., ZORZATO F., GREEN N.M., 
RA MACLENNAN D.H.; 
RL J. BIOL. CHEM. 265:13472-13483(1990). 
RN [2] 
RP PHOSPHORYLATION OF SER-2809. 
RM 91250425 
RA WITCHER D.R., KOVACS R.J., SCHULMAN H., CEFALI D.C., JONES L.R.; 
CC -!- [illegible] TRANSVERSE-TUBULES AND SARCOPLASMIC 
CC RETICULUM. CONTRACTION OF CARDIAC MUSCLE IS TRIGGERED BY RELEASE 
CC OF CA++ FROM SR FOLLOWING DEPOLARIZATION OF T-TUBULES. 
CC -!- THE CALCIUM RELEASE CHANNEL IS MODULATED BY CA++, MG++, ATP, AND 
CC -!- THE CALCIUM RELEASE CHANNEL ACTIVITY RESIDES IN THE C-TERMINAL 
CC REGION WHILE THE REMAINING PART OF THE PROTEIN CONSTITUTES THE 
CC 'FOOT' STRUCTURE SPANNING THE JUNCTIONAL GAP BETWEEN THE SR AND 
CC THE T-TUBULE. IT IS POSSIBLE THAT THE FOOT STRUCTURE INTERACTS 
CC WITH THE CYTOPLASMIC REGION OF THE DIHYDROPYRIDINE RECEPTOR. 
CC RYANODINE IS AN ALKALOID THAT BINDS TO THE CA-RELEASE CHANNEL IN 
CC JUNCTIONAL SR AND MODULATES ITS ACTIVITY. 
CC -!- SUBUNIT: HOMOTETRAMER (POTENTIAL). 
CC -!- TISSUE SPECIFICITY: HEART AND BRAIN. 
CC -!- SIMILARITY: LOCAL & LOW WITH THE NICOTINIC ACETYLCHOLINE RECEPTOR 
CC (N-ACHR) SUBUNITS. 
DR EMBL; M59743; OCCA2RE. 
KW RECEPTOR; TRANSMEMBRANE; IONIC CHANNEL; CALCIUM CHANNEL; REPEAT; 
KW PHOSPHORYLATION; GLYCOPROTEIN. 
FT DOMAIN 1 3090 CYTOPLASMIC. 
FT TRANSMEM 3091 3110 M' (POTENTIAL). 
FT TRANSMEM 3154 3172 M'' (POTENTIAL). 
FT TRANSMEM 3941 3960 M1 (POTENTIAL). 
FT TRANSMEM 3979 3996 M2 (POTENTIAL). 
FT TRANSMEM 4234 4257 M3 (POTENTIAL). 
FT TRANSMEM 4295 4315 M4 (POTENTIAL). 
FT TRANSMEM 4501 4521 M5 (POTENTIAL). 
FT TRANSMEM 4580 4602 M6 (POTENTIAL). 
FT TRANSMEM 4722 4752 M7 (POTENTIAL). 
FT TRANSMEM 4770 4788 M8 (POTENTIAL). 
FT TRANSMEM 4812 4829 M9 (POTENTIAL). 
FT TRANSMEM 4847 4869 M10 (POTENTIAL). 
FT DOMAIN 853 2926 4 X APPROXIMATE REPEATS. 
FT REPEAT 853 966 1. 
FT REPEAT 967 1080 2. 
FT REPEAT 2693 2811 3. 
FT REPEAT 2813 2926 4. 
FT BINDING 2619 3016 MODULATOR (POTENTIAL). 
FT BINDING 2775 2807 CALMODULIN (POTENTIAL). 
FT BINDING 2877 2898 CALMODULIN (POTENTIAL). 
FT BINDING 2998 3016 CALMODULIN (POTENTIAL). 
FT MOD_RES 2809 2809 PHOSPHORYLATION (BY CAM-KINASE). 
FT CARBOHYD 198 198 POTENTIAL. 
FT CARBOHYD 404 404 POTENTIAL. 
FT CARBOHYD 1636 1636 POTENTIAL. 
FT CARBOHYD 2224 2224 POTENTIAL. 
FT CARBOHYD 2803 2803 POTENTIAL. 
FT CARBOHYD 2831 2831 POTENTIAL. 
FT CARBOHYD 3096 3096 POTENTIAL. 
FT CARBOHYD 4105 4105 POTENTIAL. 
FT CARBOHYD 4796 4796 POTENTIAL. 
SQ SEQUENCE 4969 AA; 565060 MW; 24964830 CN; 



DB 6; Score 65; Match 38.1%; Predicted No. 3.97e+00; 

Matches 8; Conservative 3; Mismatches 10; Indels 0; Gaps 

Db 3616 ravnlflqgyekswieteehy 3636 

! Mill I : l 5 1 
Qy 2 RDVDLFLTGTPDEYVEQVAQY 22 



RESULT 13 

ID GLSF_ANTSP STANDARD; PRT; 1536 AA. 

AC Q06434; 

DT 01-JUN-1994 (REL. 29, CREATED) 

DT 01-JUN-1994 (REL. 29, LAST SEQUENCE UPDATE) 

DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE) 

DE FERREDOXIN-DEPENDENT GLUTAMATE SYNTHASE (EC 1.4.7.1) (FD-GOGAT). 

GN GLTB OR GLSF. 

OS ANTITHAMNION SP. 

OG CHLOROPLAST. 

OC EUKARYOTA; PLANTA; PHYCOPHYTA; RHODOPHYTA (RED ALGAE). 

RN [1] 

RP SEQUENCE FROM N.A. 

RM 94033299 

RA VALENTIN K.U., KOSTRZEWA M., ZETSCHE K.; 

RL PLANT MOL. BIOL. 23:77-85(1993). 

CC -!- CATALYTIC ACTIVITY: 2 L-GLUTAMATE + 2 OXIDIZED FERREDOXIN = 

CC L-GLUTAMINE + 2-OXOGLUTARATE + 2 REDUCED FERREDOXIN. 

CC -!- COFACTOR: IRON-SULFUR; FAD AND FMN FLAVOPROTEIN. 

CC -!- PATHWAY: GLUTAMINE SYNTHETASE/GOGAT PATHWAY WHICH IS INVOLVED 

CC IN THE ASSIMILATION OF AMMONIA. 

CC -!- SUBUNIT: MONOMER. 

CC -!- SUBCELLULAR LOCATION: CHLOROPLAST STROMA. 

CC -!- SIMILARITY: TO OTHER GLUTAMATE SYNTHASES. 

DR EMBL; Z21705; CHASGLTB. 

DR PIR; S31911; S31911. 

KW OXIDOREDUCTASE; IRON-SULFUR; FLAVOPROTEIN; FAD; FMN; CHLOROPLAST; 

KW GLUTAMATE BIOSYNTHESIS. 

FT NP_BIND 1105 1162 FMN (BY SIMILARITY). 

SQ SEQUENCE 1536 AA; 171111 MW; 12053975 CN; 

DB 3; Score 65; Match 33.3%; Predicted No. 3.97e+00; 

Matches 5; Conservative 6; Mismatches 4; Indels 0; Gaps 

Db 1340 kgihlylkgeandyv 1354 

: : IM I -Ml 

Qy 2 RDVDLFLTGTPDEYV 16 



RESULT 14 

ID VG67_HSVI1 STANDARD; PRT; 1556 AA. 

AC Q00107; 

DT 01-DEC-1992 (REL. 24, CREATED) 

DT 01-DEC-1992 (REL. 24, LAST SEQUENCE UPDATE) 

DT 01-DEC-1992 (REL. 24, LAST ANNOTATION UPDATE) 

DE HYPOTHETICAL GENE 67 PROTEIN. 

OS ICTALURID HERPESVIRUS 1 (CHANNEL CATFISH VIRUS) (CCV) . 



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=AUBURN 1; 

RM 92087490 

RA DAVISON A.J.; 

RL VIROLOGY 186:9-14(1992). 

DR EMBL; M75136; HECHCCOMG. 

DR PIR; D36793; D36793. 

KW HYPOTHETICAL PROTEIN. 

SQ SEQUENCE 1556 AA; 173577 MW; 12789685 CN; 

DB 7; Score 64; Match 34.6%; Predicted No. 5.64e+00; 

Matches 9; Conservative 6; Mismatches 11; Indels 

Db 1201 ravesfnlrdparyivelapegslpv 1226 

I Pi' i I' :J I 
Qy 2 RDVDLFLTGTPDEYVEQVAQYKALPV 27 



RESULT 15 

ID SUIS_RAT STANDARD; PRT; 917 AA. 

AC P23739; 

DT 01-NOV-1991 (REL. 20, CREATED) 

DT 01-AUG-1992 (REL. 23, LAST SEQUENCE UPDATE) 

DT 01-FEB-1994 (REL. 28, LAST ANNOTATION UPDATE) 

DE SUCRASE-ISOMALTASE, INTESTINAL (EC 3.2.1.48) / (EC 3.2.1.10) 

DE (FRAGMENTS). 

OS RATTUS NORVEGICUS (RAT). 

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA; 

OC EUTHERIA; RODENTIA. 

RN [1] 

RP SEQUENCE OF 1-275 FROM N.A. 

RC STRAIN=FISHER 344; TISSUE=INTESTINE; 

RM 91097578 

RA TRABER P.G.; 

RL BIOCHEM. BIOPHYS. RES. COMMUN. 173:765-773(1990). 

RN [2] 

RP SEQUENCE OF 276-917 FROM N.A. 

RC STRAIN=SPRAGUE-DAWLEY; TISSUE=DUODENUM; 

RM 90381315 

RA BROYART J.-P., HUGOT J.-P., PERRET C., PORTEU A.; 

RL BIOCHIM. BIOPHYS. ACTA 1087:61-67(1990). 

CC -!- FUNCTION: PLAYS AN IMPORTANT ROLE IN THE FINAL STAGE OF 
CC CARBOHYDRATE DIGESTION. 

CC -!- CATALYTIC ACTIVITY: HYDROLYSIS OF SUCROSE AND MALTOSE BY AN 
CC ALPHA-D-GLUCOSIDASE-TYPE ACTION. 

CC -!- CATALYTIC ACTIVITY: HYDROLYSIS OF 1,6-ALPHA-D-GLUCOSIDIC LINKAGES 
CC IN ISOMALTOSE AND DEXTRINS PRODUCED FROM STARCH AND GLYCOGEN BY 

CC ALPHA-AMYLASE. 

CC -!- SUBCELLULAR LOCATION: TYPE II MEMBRANE PROTEIN. BRUSH BORDER. 

CC -!- PTM: THE PRECURSOR IS PROTEOLYTICALLY CLEAVED WHEN EXPOSED TO 

CC PANCREATIC PROTEASES IN THE INTESTINAL LUMEN. 

CC -!- SUBUNIT: THE RESULTING SUCRASE AND ISOMALTASE SUBUNITS STAY 

CC ASSOCIATED WITH ONE ANOTHER IN A COMPLEX BY NON-COVALENT LINKAGES. 

CC -!- THERE IS A HIGH DEGREE OF HOMOLOGY BETWEEN THE ISOMALTASE AND 

CC SUCRASE PORTIONS (41% OF AMINO ACID IDENTITY) INDICATING THAT 

CC THIS PROTEIN IS EVOLVED BY PARTIAL GENE DUPLICATION. 

CC -!- SIMILARITY: BELONGS TO FAMILY 31 OF GLYCOSYL HYDROLASES. 

DR EMBL; M62889; RRSI. 

DR PIR; S11386; S11386. 

DR PROSITE; PS00129; GLYCOSYL_HYDROL_F31_1. 
DR PROSITE; PS00707; GLYCOSYL_HYDROL_F31_2. 

KW MULTIFUNCTIONAL ENZYME; INTESTINE; TRANSMEMBRANE; GLYCOPROTEIN; 
KW HYDROLASE; GLYCOSIDASE; DUPLICATION. 
FT NON_TER 1 1 



FT NON_CONS 275 276 
FT DOMAIN <276 557 ISOMALTASE. 
FT DOMAIN 558 >917 SUCRASE. 
FT CARBOHYD 23 23 POTENTIAL. 
FT CARBOHYD 302 302 POTENTIAL. 
FT CARBOHYD 309 309 POTENTIAL. 
FT CARBOHYD 411 411 POTENTIAL. 
FT CARBOHYD 454 454 POTENTIAL. 
FT CARBOHYD 784 784 POTENTIAL. 
FT CARBOHYD 852 852 POTENTIAL. 
FT CARBOHYD 889 889 POTENTIAL. 
FT CARBOHYD 903 903 POTENTIAL. 
FT NON_TER 917 917 
SQ SEQUENCE 917 AA; 105272 MW; 4481141 CN; 



DB 6; Score 64; Match 33.3%; Predicted No. 5.64e+00; 

Matches 7; Conservative 8; Mismatches 6; Indels 0; Gaps 0; 

Db 249 if Igdtpeqwqqyqefngrp 269 

Qy 6 LFLTGTPDEYVEQVAQYKALP 26 

Search completed: Fri Mar 24 07:41:16 1995 
Job time : 13 secs. 
Release 2.0 John F. Collins & S. S. Sturrock, Biocomputing Research Unit. 
Copyright (c) 1993, 1994 by University of Edinburgh, U.K. 
Distribution rights by IntelliGenetics, Inc. 



MPsrch.pp protein - protein database search, using Smith-Waterman algorithm 
Run on:         Fri Mar 24 07:44:01 1995; MasPar time 2.86 Seconds 
                57.341 Million cell updates/sec 

Tabular output not generated. 
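
The cell-update rate in this header can be roughly cross-checked from the other header figures. The Python sketch below assumes that one "cell update" is one (query residue, database residue) pair of the Smith-Waterman matrix. That interpretation and the function name are assumptions, not something the report states.

```python
# Figures copied from the run header of this search.
QUERY_LENGTH = 27              # query residues, "Sequence: 1 ... 27"
DATABASE_RESIDUES = 6_065_180  # "Searched: 50375 seqs, 6065180 residues"
MASPAR_SECONDS = 2.86          # "MasPar time 2.86 Seconds"


def cell_updates_per_second(query_len: int, db_residues: int, seconds: float) -> float:
    """Matrix cells filled per second, assuming one cell per residue pair."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return query_len * db_residues / seconds


rate = cell_updates_per_second(QUERY_LENGTH, DATABASE_RESIDUES, MASPAR_SECONDS)
print(f"{rate / 1e6:.2f} million cell updates/sec")
```

The result is about 57.26 million, close to the reported 57.341 million; the small difference is consistent with the elapsed time being printed to only two decimal places.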



Title:          >US-08-300-510-2 
Description:    (1:27) from US08300510.pep 
Perfect Score:  184 
Sequence:       1 KALPVVLENARILKNCVDAKMTEEDKE 27 

Scoring table:  PAM 150 
                Gap 14 

Searched:       50375 seqs, 6065180 residues 
Database:       a-geneseq1 
                1 a-gen1 
                2 a-gen2 
                3 a-gen3 
                4 a-gen4 
                5 a-gen5 
                6 a-gen6 
                7 a-gen7 
                8 a-gen8 
                9 a-gen9 
                10 a-gen10 

Statistics: 



Mean 21.324; Variance 75.358; scale 0.283 



Predicted No. is the number of results expected by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
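
The report does not say how the score distribution is analysed. One simple approach, assumed here purely for illustration, is to fit a straight line to the logarithm of the number of database scores at or above each score value and extrapolate that line to the score of interest. The Python sketch below does this on invented toy data; the function names and the data are hypothetical, and this is not claimed to be the calculation MPSRCH performs.

```python
import math


def fit_log_survival(scores):
    """Least-squares line through (s, ln N(>=s)) for every distinct score s."""
    ordered = sorted(scores)
    total = len(ordered)
    xs, ys = [], []
    for i, s in enumerate(ordered):
        if i == 0 or s != ordered[i - 1]:
            xs.append(float(s))
            ys.append(math.log(total - i))  # total - i scores are >= s
    if len(xs) < 2:
        raise ValueError("need at least two distinct scores to fit a line")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_no(score, slope, intercept):
    """Extrapolated count of chance results scoring at least `score`."""
    return math.exp(intercept + slope * score)


# Toy data (invented): the count of scores >= k halves with each extra point.
toy_scores = [k for k in range(10) for _ in range(2 ** (9 - k))] + [10]
slope, intercept = fit_log_survival(toy_scores)
print(predicted_no(20, slope, intercept))
```

Under such a model a result scoring far above the bulk of the distribution gets a predicted number much smaller than 1, which is the pattern seen for the exact matches in this report.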
TRFP chain 1 with lea 
Peptide Y. 

TRFP chain 1 (with Le 
TRFP Chain #1 with CI 
Pig H4 isoenzyme. 
p190 protein. 
ORF3 product from the 

ALIGNMENTS 

RESULT 1 

ID R41984 standard; Protein; 88 AA. 

AC R41984; 

DT 21-APR-1994 (first entry) 

DE Human T cell reactive feline protein B chain 1. 

KW Human; T cell; reactive; feline; protein; immune response; antigen; 

KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis; 

KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia; 

KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen. 

OS Homo sapiens. 

FH Key Location/Qualifiers 

FT Peptide 1..17 

FT /note= "Signal peptide" 

FT Protein 18..88 

FT /note= "Mature protein" 

PN WO9319178-A. 

PD 30-SEP-1993. 

PF 25-MAR-1993; U02462. 

PR 25-MAR-1992; US-857311. 

PR 15-MAY-1992; US-884718. 

PR 15-JAN-1993; US-006116. 

PA (IMMU-) IMMUNOLOGIC PHARM CORP. 

PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M, 

PI Morville M; 

DR WPI; 93-320744/40. 

DR N-PSDB; Q49534. 

PT New peptide(s) for inducing tolerance - comprise one or more 

PT epitope(s) of an allergen administered subcutaneously, for 

PT treating sensitivity to cats, bees, etc. 

PS Disclosure; Fig 1; 107pp; English. 

CC The sequences given in R41983-84 represent chain 1 of human T cell 

CC reactive feline proteins (TRFP) A and B respectively. Peptides 

CC derived from TRFP may be used in a therapeutic composition which is 

CC useful in treating diseases which involve an immune response to a 

CC protein antigen. This composition may be used to induce tolerance 

CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria, 

CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago, 

CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens 

CC in humans. 

SQ Sequence 88 AA; 

DB 8; Score 184; Match 100.0%; Predicted No. 4.68e-12; 
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 

Db 47 kalpvvlenarilkncvdakmteedke 73 
      ||||||||||||||||||||||||||| 

Qy 1 KALPVVLENARILKNCVDAKMTEEDKE 27 



RESULT 2 

ID R41983 standard; Protein; 92 AA. 

AC R41983; 

DT 21-APR-1994 (first entry) 

DE Human T cell reactive feline protein A chain 1. 

KW Human; T cell; reactive; feline; protein; immune response; antigen; 

KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis; 

KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia; 

KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen. 

OS Homo sapiens. 

FH Key Location/Qualifiers 

FT Peptide 1..22 

FT /note= "Signal peptide" 

FT Protein 23..92 

FT /note= "Mature protein" 

PN WO9319178-A. 

PD 30-SEP-1993. 

PF 25-MAR-1993; U02462. 

PR 25-MAR-1992; US-857311. 

PR 15-JAN-1993; US-006116. 

PA (IMMU-) IMMUNOLOGIC PHARM CORP. 

PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M, 

PI Morville M; 

DR WPI; 93-320744/40. 

DR N-PSDB; Q49533. 

PT New peptide(s) for inducing tolerance - comprise one or more 
PT epitope(s) of an allergen administered subcutaneously, for 
PT treating sensitivity to cats, bees, etc. 
PS Disclosure; Fig 1; 107pp; English. 

CC The sequences given in R41983-84 represent chain 1 of human T cell 
CC reactive feline proteins (TRFP) A and B respectively. Peptides 
CC derived from TRFP may be used in a therapeutic composition which is 
CC useful in treating diseases which involve an immune response to a 
CC protein antigen. This composition may be used to induce tolerance 
CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria, 
CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago, 
CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens 
CC in humans. 
SQ Sequence 92 AA; 

DB 8; Score 184; Match 100.0%; Predicted No. 4.68e-12; 
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 

Db 51 kalpvvlenarilkncvdakmteedke 77 
      ||||||||||||||||||||||||||| 
Qy  1 KALPVVLENARILKNCVDAKMTEEDKE 27 



RESULT 3 

ID R41976 standard; peptide; 27 AA.
AC R41976;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein fragment Y.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen; ss.
OS Homo sapiens.
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006116.
PA (IMMU-) IMMUNOLOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL,
PI Kuo M, Morville M;
DR WPI; 93-320744/40.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Claim 1; Fig 3; 107pp; English.
CC The sequences given in R41975-82 are peptides derived from a human T
CC cell reactive feline protein. These peptides are used in a
CC therapeutic composition which is useful in treating diseases which
CC involve an immune response to a protein antigen. This composition
CC may be used to induce tolerance in a mammal to Dermatophagoides,
CC Felis, Ambrosia, Lolium, Cryptomeria, Alternaria, Alder, Betula,
CC Quercus, Olea, Artemesia, Plantago, Parietaria, Canis, Blattella,
CC Apis, Periplaneta and to autoantigens in humans.
SQ Sequence 27 AA;

DB 8; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db    1 kalpvvlenarilkncvdakmteedke 27
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 4 

ID R36548 standard; Protein; 96 AA.
AC R36548;
DT 12-AUG-1993 (first entry)
DE Recombitope YZX.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope;
KW sensitivity; Felis domesticus.
OS Synthetic.
FH Key Location/Qualifiers
FT Cleavage-site 14..15
FT /label= thrombin_cleavage_site
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41572.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT autoantigens and protein antigens
PS Disclosure; Fig 5; 73pp; English.
CC Preferred recombitope peptides for treating sensitivity to Felis
CC domesticus are derived from the genus Felis and comprise
CC regions selected from peptides X, Y, Z, A and B of TRFP, and
CC modifications thereof, such as peptide C.
CC Oligonucleotides C, D, E, F, G, H and I are used in the
CC construction of recombitope peptide YZX.
SQ Sequence 96 AA;

DB 7; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   17 kalpvvlenarilkncvdakmteedke 43
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 5 

ID R12119 standard; Protein; 94 AA.
AC R12119;
DT 26-JUL-1991 (first entry)
DE TRFP chain 1 with leader A.
KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 3..24
FT /label= Leader B
FT Protein 25..94
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U06548.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM COR.
PI Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL,
PI Brauer AW;
DR WPI; 91-164136/22.
DR N-PSDB; Q11836.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.
CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genomic library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptide derivs.
CC which stimulate T-cells in persons allergic to cats. The peptides
CC can be used to reduce/eliminate the allergic response partic. by
CC modificn. of lymphokine prodn. by the T-cells. They can also be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species, and also
CC for prodn. of modified forms of TRFP esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12120-R12123.
SQ Sequence 94 AA;

DB 3; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   53 kalpvvlenarilkncvdakmteedke 79
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27
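
Each alignment in this listing is preceded by a score line of the form "DB n; Score s; Match p%; Predicted No. x;". A minimal sketch of pulling those four fields out of such a line follows. It assumes a line free of scanning damage, and the function name and the dictionary keys are illustrative choices, not part of MPsrch.

```python
import re

# Pattern for a clean score line such as:
#   DB 3; Score 184; Match 100.0%; Predicted No. 4.68e-12;
SCORE_LINE = re.compile(
    r"DB\s+(?P<db>\d+);\s*Score\s+(?P<score>\d+);\s*"
    r"Match\s+(?P<match>[\d.]+)%;\s*Predicted No\.\s+(?P<pred>[-+.\deE]+);"
)


def parse_score_line(line):
    """Return the fields of a score line as a dict, or None if it does not match."""
    m = SCORE_LINE.search(line)
    if m is None:
        return None
    return {
        "db": int(m["db"]),                # database section number
        "score": int(m["score"]),          # alignment score
        "match_pct": float(m["match"]),    # percent match in the alignment
        "pred_no": float(m["pred"]),       # predicted number of chance hits
    }


if __name__ == "__main__":
    print(parse_score_line("DB 3; Score 184; Match 100.0%; Predicted No. 4.68e-12;"))
```

Lines that still carry scanning damage (for example "!" in place of ";") will simply return None, which makes them easy to flag for manual review.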



RESULT 6 

ID R36539 standard; Protein; 92 AA.
AC R36539;
DT 12-AUG-1993 (first entry)
DE TRFP chain 1 (with Leader A).
KW Human T cell reactive feline protein; TRFP; leader A; leader B;
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..22
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41556.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT autoantigens and protein antigens
PS Disclosure; Fig 1; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 92 AA;

DB 7; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   51 kalpvvlenarilkncvdakmteedke 77
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 7 

ID R36543 standard; Protein; 27 AA.
AC R36543;
DT 12-AUG-1993 (first entry)
DE Peptide Y.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope.
OS Felis.
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT autoantigens and protein antigens
PS Disclosure; Fig 4; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP. DNA
CC constructs were assembled in which 3 regions (encoding peptides X,
CC Y and Z) were linked to produce DNA constructs encoding recombitope
CC peptides.
SQ Sequence 27 AA;

DB 7; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db    1 kalpvvlenarilkncvdakmteedke 27
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 8 



ID R36540 standard; Protein; 88 AA.
AC R36540;
DT 12-AUG-1993 (first entry)
DE TRFP chain 1 (with Leader B).
KW Human T cell reactive feline protein; TRFP; leader A; leader B;
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..18
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41557.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT autoantigens and protein antigens
PS Disclosure; Fig 1; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 88 AA;

DB 7; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   47 kalpvvlenarilkncvdakmteedke 73
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 9 

ID R12120 standard; Protein; 96 AA.
AC R12120;
DT 26-JUL-1991 (first entry)
DE TRFP chain 1 with leader B.
KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 9..26
FT /label= Leader B
FT Protein 27..96
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U06548.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM COR.
PI Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL,
PI Brauer AW;
DR WPI; 91-164136/22.
DR N-PSDB; Q11837.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.
CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genomic library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptide derivs.
CC which stimulate T-cells in persons allergic to cats. The peptides
CC can be used to reduce/eliminate the allergic response partic. by
CC modificn. of lymphokine prodn. by the T-cells. They can also be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species, and also
CC for prodn. of modified forms of TRFP esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12119-R12123.
SQ Sequence 96 AA;

DB 3; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   55 kalpvvlenarilkncvdakmteedke 81
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 10 

ID R27368 standard; protein; 96 AA.
AC R27368;
DT 25-FEB-1993 (first entry)
DE TRFP Chain #1 with CI leader B sequence.
KW T cell reactive feline protein; cat allergy; allergic; IgE;
OS Felis domesticus.
FH Key Location/Qualifiers
FT Peptide 1..27
FT /label= Leader B
FT Protein 28..96
FT /label= TRFP chain #1
PN WO9215613-A.
PD 17-SEP-1992.
PF 20-FEB-1992; U01344.
PR 28-FEB-1991; US-662193.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond J, Kuo M;
DR WPI; 92-331670/40.
PT Modified human T-cell reactive feline protein - stimulates T-cell
PT in individuals allergic to cats and shows reduced
PT histamine-releasing properties
PS Claim 1; Fig 1; 35pp; English.
CC This sequence represents a modified human T-cell reactive feline
CC protein which stimulates T-cells from an individual who is allergic
CC to cats, but which interacts with human IgE to a lesser extent than
CC does affinity purified TRFP. The protein is modified by treating
CC with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC amines) or an enzyme which removes O-linked groups (carbohydrate
CC moieties). It is useful in desensitising people who are allergic to cats.
SQ Sequence 96 AA;

DB 5; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   55 kalpvvlenarilkncvdakmteedke 81
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 11 

ID R27367 standard; protein; 94 AA.
AC R27367;
DT 25-FEB-1993 (first entry)
DE TRFP Chain #1 with CI leader A sequence.
KW T cell reactive feline protein.
OS Felis domesticus.
FH Key Location/Qualifiers
FT Peptide 1..25
FT /label= Leader A
FT Protein 25..94
FT /label= TRFP chain #1
PN WO9215613-A.
PD 17-SEP-1992.
PF 20-FEB-1992; U01344.
PR 28-FEB-1991; US-662193.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond J, Kuo M;
DR WPI; 92-331670/40.
PT Modified human T-cell reactive feline protein - stimulates T-cell
PT in individuals allergic to cats and shows reduced
PT histamine-releasing properties
PS Claim 1; Fig 1; 35pp; English.
CC This sequence represents a modified human T-cell reactive feline
CC protein which stimulates T-cells from an individual who is allergic
CC to cats, but which interacts with human IgE to a lesser extent than
CC does affinity purified TRFP. The protein is modified by treating
CC with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC amines) or an enzyme which removes O-linked groups (carbohydrate
CC moieties). It is useful in desensitising people who are allergic to cats.
SQ Sequence 94 AA;

DB 5; Score 184; Match 100.0%; Predicted No. 4.68e-12;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   53 kalpvvlenarilkncvdakmteedke 79
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 12 

ID P91948 standard; protein; 333 AA.
AC P91948;
DT 16-FEB-1990 (first entry)
DE Pig H4 isoenzyme.
KW NAD-dependent lactate dehydrogenase; H4 isoenzyme.
OS Suidae.
FH Key Location/Qualifiers
FT Binding-site 98..110
FT /note="substrate recognition site."
FT Binding-site 167..173
FT /note="activator site."
FT Misc-difference 102..102
FT /note="basic AA."
FT Misc-difference 173..173
FT /note="basic AA."
PN W0890B7G7-A.
PD 21-SEP-1989.
PF 16-MAR-1989; G00279.
PR 17-MAR-1988; GB-006358.
PA (UYBR-) University of Bristol.
PI Holbrook JJ, Clarke AR, Atkinson A;
DR WPI; 89-292522/40.
PT Recombinant NAD-dependent dehydrogenase - which interconverts malate
PT and oxaloacetate, and has low dependence on fructose-1,6-biphosphate
PT as activator.
PS Disclosure; page 4-5; 25pp; English.
CC Sequence codes for the H4 isoenzyme of pig - an NAD-dependent lactate
CC dehydrogenase. It is used to construct a recombinant enzyme in
CC AA102 and AA173 are basic, esp. Arg, and Gln resp. The mutation of AA102
CC results in the creation of a malate dehydrogenase from the lactate
CC dehydrogenase framework, the mutation being on the mobile coenzyme loop
CC and changing the substrate binding specificity of the protein. The
CC mutation of AA173, which is in the activation site, decreases
CC sensitivity of the protein to activation by sugar phosphates.
SQ Sequence 333 AA;

DB 1; Score 68; Match 37.5%; Predicted No. 6.74e+00;
Matches 9; Conservative 8; Mismatches 6; Indels 1; Gaps 1;



Db 290 slpcvl-nargltsvinqklkdde 312 

Ml II III M " I 5 ;!; 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25 
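
The Match percentage printed on each alignment score line appears to be the number of identical positions divided by the total number of aligned columns (matches, conservative substitutions, mismatches and indels together). That reading is an inference from the figures in this listing, not a documented MPsrch definition. The sketch below checks it against three alignments shown here; the function name is an illustrative choice.

```python
def alignment_match_percent(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns, rounded to one decimal.

    Assumption: the aligned length is the sum of the four counts printed
    on the "Matches ...; Conservative ...; Mismatches ...; Indels ..." line.
    """
    aligned_columns = matches + conservative + mismatches + indels
    if aligned_columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # Counts taken from alignments in this listing.
    print(alignment_match_percent(9, 8, 6, 1))    # pig H4 isoenzyme alignment
    print(alignment_match_percent(7, 3, 4, 0))    # p190 protein alignment
    print(alignment_match_percent(27, 0, 0, 0))   # exact 27-residue hit
```

Under this reading a short exact hit and a full-length exact hit both report 100%, which is why the Match figure on the score line can differ from the % Query Match column of the SUMMARIES table.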



RESULT 13 

ID R43253 standard; Protein; 1513 AA.
AC R43253;
DT 04-MAY-1994 (first entry)
DE p190 protein.
KW p190; phosphoprotein; GTPase activating protein; GAP; n-chimerin;
KW mitogenically-stimulated cells; tyrosine kinase-transformed cells;
KW GAP-associated protein; homology; superfamily; signal transduction;
KW transcription repressor; GRF-1; BCR; breakpoint cluster region gene;
KW p21ras; effector; GTPase; mitogenic.
OS Rattus rattus.
FH Key Location/Qualifiers

FT /note= "Region of homology to GTPase superfamily"
FT Region 389..1166
FT /note= "Region of homology to GRF-1"
FT Region 1268..1429
FT /note= "Region of homology to BCR/n-chimerin"
PN WO9320201-A.
PD 14-OCT-1993.
PF 31-MAR-1993; U03076.
PR 31-MAR-1992; US-861207.
PA (WHED ) WHITEHEAD INST BIOMEDICAL RES.
PI Settleman JE, Weinberg RA;
DR WPI; 93-336909/42.
DR N-PSDB; Q50168.
PT New GAP-associated protein P190 - which can be inhibited to
PT interfere with RAS oncogene(s) in pathogenesis of malignancies
PS Claim 4; Page 42-55; 95pp; English.
CC This sequence represents the p190 protein. p190 is a phosphoprotein
CC which is tightly bound to GTPase activating protein (GAP) in
CC mitogenically-stimulated and tyrosine kinase-transformed cells.
CC p190 is GAP-associated protein. p190 has three distinct domains,
CC each of which exhibits homology to a previously described sequence.
CC Towards the amino terminal end of p190, a domain spanning 201 amino
CC acids exhibits significant sequence similarity to all members of the
CC GTPase super-family. The most striking similarity in the predicted
CC amino acid sequence of p190 is to a 95 kD protein encoded by a human
CC cDNA which is reported to function as a transcription repressor. The
CC reported amino acid sequence of the repressor protein, GRF-1, is
CC identical over a 778 amino acid fragment of p190. Towards the carboxy
CC terminal of p190 there is a region of 161 residues which shows homology
CC with two proteins which are involved in signal transduction. It has
CC been suggested that p21ras acts as a regulatory subunit of p190
CC protein, which acts as the p21ras effector and which releases mitogenic
CC signals when prompted to do so by activated GTP-bound p21ras. p190 may
CC also, acting via GAP, transduce signals from p21ras to the nucleus,
CC affecting expression of specific cellular genes.
SQ Sequence 1513 AA;

DB 8; Score 60; Match 50.0%; Predicted No. 3.60e+01;
Matches 7; Conservative 3; Mismatches 4; Indels 0; Gaps 0;

Db 78 evsrsledcveckm 91 

I 'I I 'II' II 
Qy 8 ENARILKNCVDAKM 21 



RESULT 14 

ID R04571 standard; protein; 520 AA.
AC R04571;
DT 14-SEP-1990 (first entry)
DE ORF3 product from the mos gene.
KW Rhizopine; mos gene; moc gene; nitrogen fixation; Medicago sativa.
OS Rhizobium meliloti strain L5-30.
PN AU8941262-A.
PD 15-MAR-1990.
PF 08-SEP-1988; A41262.
PR 08-SEP-1988; AU-000328.
PA (LUMI-) Luminis PTY Ltd.
PI Temp J, Kondorosi A, Putnoky P, Murphy PJ, Schell JS, De Bruijn FJ;
DR WPI; 90-139827/19.
DR N-PSDB; Q04303.
PT Bacteria contg. genes for rhizopine synthesis and catabolism - esp.
PT Rhizobium strains for increasing nitrogen fixation and growth in
PT leguminous plants.
PS Disclosure; p; English.
CC The mos ORF 3 product is a protein of a predicted size of 35.8kD.

CC to catabolise rhizopine compounds are used to increase symbiotic nitrogen
CC fixation in Leguminaceae, esp. alfalfa. Where a moc gene is
CC present in separate bacteria both N-fixation and plant growth can be
CC promoted. Alternatively, mos genes are expressed in the plant and only
CC moc in the bacteria, this will cause desirable soil bacteria (eg being
CC used for biological control of a pathogen) to be held in the rhizosphere.
CC See also R04569-72.
SQ Sequence 520 AA;

DB 1; Score 60; Match 58.3%; Predicted No. 3.60e+01;
Matches 7; Conservative 5; Mismatches 0; Indels 0; Gaps 0;



Db 4 lplvlqngqimk 15

I P I I : 1 5 M 5 I 
Qy 3 LPVVLENARILK 14 



RESULT 15 

ID P92275 standard; peptide; 765 AA.
AC P92275;
DT 27-Feb-1990 (first entry)
DE Human topoisomerase I cDNA
KW Scleroderma.
OS Homo sapiens (human).
PN WO8909222-A.
PD 05-OCT-1989.
PF 22-MAR-1989; U01116.
PR 23-MAR-1988; US-172159.
PA (BRIG) Brigham and Women's Hospital; (UYJO) John's Hopkins Univ.
PI Earnshaw WC, D'Arpa P;
DR WPI; 89-309500/42.
DR N-PSDB; N91475.
PT Cloned cDNA encoding eukaryotic topoisomerase I - useful for large scale
PT prodn. by recombinant methods
PS Claim 6; fig. 5; 28pp; English.
CC The cDNA of this can be spliced into DNA vectors and used to transform
CC hosts for high yield. This polypeptide (I) retains the ability to bind
CC autoantibodies, even though the prokaryotic host degrades transcribed I
CC into a spectrum of polypeptides. (I) may be used to classify patients
CC with immune rheumatic diseases.
SQ Sequence 765 AA;

DB 1; Score 59; Match 50.0%; Predicted No. 4.41e+01;
Matches 10; Conservative 5; Mismatches 4; Indels 1; Gaps 1;



Db 445 etarrlkkcvd-kirnqyre 463

I II I I - 1 I I M : l 
Qy 8 ENARILKNCVDAKMTEEDKE 27

Search completed: Fri Mar 24 07:44:12 1995
Job time : 11 secs.
MPsrch_pp protein - protein database search, using Smith-Waterman algorithm

Run on: Fri Mar 24 07**3:20 1995; MasPar time 4.72 Seconds
        128.633 Million cell updates/sec

Tabular output not generated.
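
The cell-update rate can be roughly cross-checked from figures reported for this run: a 27-residue query, 22468834 database residues, and 4.72 seconds of MasPar time. The sketch below assumes one dynamic-programming cell per (query residue, database residue) pair, which is the usual accounting for Smith-Waterman but is not stated in the output; the function name is illustrative.

```python
def cell_updates_per_second(query_length, database_residues, seconds):
    """Dynamic-programming cells filled per second.

    Assumption: every query residue is compared with every database
    residue exactly once, so cells = query_length * database_residues.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return query_length * database_residues / seconds


if __name__ == "__main__":
    rate = cell_updates_per_second(27, 22468834, 4.72)
    print(f"{rate / 1e6:.2f} million cell updates/sec")
```

The result lands within about 0.1% of the 128.633 million figure printed by the program; the small gap is consistent with the MasPar time being shown rounded to two decimals.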



Title:          >US-08-3W^9^2
Description:    (1-27) from US083005IO.pep
Perfect Score:  184
Sequence:       1 KALPVVLENARILKNCVDAKMTEEDKE 27

Scoring table:  PAM 150
                Gap 14

Searched:       75511 seqs, 22468834 residues

Database:       pir43
                1:ANN01 2:ANN02 3:ANN03 4:UNANN01 5:UNANN02 6:UNANN03
                7:UNANN04 8:UNANN05 9:UNANN06 10:UNREV1 11:UNREV2 12:UNREV3
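
For readers unfamiliar with the Smith-Waterman local alignment named in the run header, here is a minimal scoring sketch. It is not the MPsrch implementation: it uses a simple match/mismatch score and a linear gap penalty as stand-ins for the PAM 150 table and gap setting of this run, so its scores are not comparable to the ones in this report. All names and default values are illustrative.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Return (best local score, (query end, target end)) for two sequences.

    Assumptions: identity scoring instead of a PAM matrix, a linear gap
    penalty, and case-sensitive comparison (upper-case both inputs first
    when mixing database and query conventions).
    """
    rows, cols = len(query) + 1, len(target) + 1
    # H[i][j] = best score of a local alignment ending at query[i-1], target[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if query[i - 1] == target[j - 1] else mismatch)
            # The floor of 0 lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos


if __name__ == "__main__":
    print(smith_waterman("KALPVV", "XXKALPVVXX"))
```

Each inner-loop iteration fills one cell of the matrix, which is the unit counted in the "cell updates/sec" figure.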



Statistics:     Mean 29.063; Variance 55.105; scale 0.527

Predicted No. is the number of results expected by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



                               SUMMARIES
                   %
Result           Query
  No.  Score   Match  Length  DB  ID      Description           Pred. No.

    1    178    96.7      88   9  JC1126  major allergen chain  1.31e-19
    2    178    96.7      92   9  JC1136  major allergen chain  1.31e-19
    3     80    43.5      40   9  A53283  major cat allergen F  4.90e-02
    4     79    42.9    2333   3  GNNY2F  genome polyprotein -  6.98e-02
    5     75    40.8    2336   5  S37077  genome polyprotein -  2.80e-01
    6     72    39.1    2332   3  GNNYF   genome polyprotein -  7.76e-01
    7     71    38.6     136   7  S38598  hypothetical protein  1.09e+00
    8     71    38.6     374   8  S28285  hypothetical protein  1.09e+00
    9     70    38.0     386  11  537691  rna1p protein - fiss  1.51e+00
   10     69    37.5     470   5  S02068  RNA-directed RNA pol  2.10e+00
   11     69    37.5     470   5  JN0431  RNA-directed RNA pol  2.10e+00
   12     68    37.0     333   1  DEPGLH  L-lactate dehydrogen  2.91e+00
   13     66    35.9     217   4  B32957  L-lactate dehydrogen  5.55e+00
   14     66    35.9     334   4  S02795  L-lactate dehydrogen  5.55e+00
   15     66    35.9     334   4  S09954  L-lactate dehydrogen  5.55e+00
   16     66    35.9    2332   3  GNNY4F  genome polyprotein -  5.55e+00
   17     66    35.9    1830   2  S19188  myosin-V - chicken    5.55e+00
   18     64    34.8     147   5  S07158  keratin type I compo  1.05e+01
   19     64    34.8     412   2  KRSHL1  keratin, 48K type I   1.05e+01
   20     64    34.8     314  10  S31402  3-methylcatechol 2,3  1.05e+01
   21     63    34.2     529   5  A24031  genome polyprotein -  1.43e+01
   22     63    34.2    4427  10  S25021  probable polyketide   1.43e+01
   23  (entry illegible in source)
   24     63    34.2     906   7  S32607  trifunctional enzyme  1.43e+01
   25     63    34.2     906   7  JT0350  hydratase (EC 4.2.1.  1.43e+01
   26     62    33.7     419   5  A25438  keratin, type I cyto  1.95e+01
   27     62    33.7     404   5  JS0073  keratin, 47.6K type   1.95e+01
   28     61    33.2     873   1  TVFVFS  protein-tyrosine kin  2.64e+01
   29     61    33.2     873   1  TVFVF   protein-tyrosine kin  2.64e+01
   30     61    33.2     '90   6  A49923  translation initiati  2.64e+01
   31     61    33.2     277   6  JN0751  Outer membrane 30K p  2.64e+01
   32     60    32.6    1226  11  S48837  kinesin-like protein  3.57e+01
   33     60    32.6    1493   9  A38218  GAP-associated prote  3.57e+01
   34     59    32.1     900   7  S25322  bifunctional beta-ox  4.81e+01
   35     59    32.1     767  12  S32698  DNA topoisomerase (E  4.81e+01
   36     59    32.1     767  12  S32697  DNA topoisomerase (E  4.81e+01
   37     59    32.1     135  11  S46635  hypothetical protein  4.81e+01
   38     59    32.1     765   1  ISHUT1  DNA topoisomerase (E  4.81e+01
   39     59    32.1     767   4  JU0144  DNA topoisomerase (E  4.81e+01
   40     59    32.1     567   6  A40899  gag polyprotein - Ch  4.81e+01
   41     59    32.1     400   8  A61556  keratin 19, cytoskel  4.81e+01
   42     59    32.1     631   9  A31203  interferon-regulated  4.81e+01
   43     59    32.1     416   9  A61404  keratin A, type I -   4.81e+01
   44     59    32.1    1178   9  A47255  pyruvate carboxylase  4.81e+01
   45     59    32.1     698   3  IKEC5B  colicin V secretion   4.81e+01
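
The % Query Match column of the SUMMARIES table appears to equal the result's score divided by the Perfect Score of the query (184 for this run), expressed as a percentage. This is an inference from the numbers in the table, offered as a sketch only; the function name is illustrative.

```python
def percent_query_match(score, perfect_score):
    """Score as a percentage of the query's perfect (self-match) score."""
    if perfect_score <= 0:
        raise ValueError("perfect score must be positive")
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    PERFECT_SCORE = 184  # "Perfect Score" reported in the run header
    for score in (178, 80, 79, 59):
        print(score, percent_query_match(score, PERFECT_SCORE))
```

Because this figure is normalised by the whole query, a short exact hit such as the 12-residue fragment in result 3 scores well under 100% here even though its alignment is a perfect match.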



ALIGNMENTS 



RESULT     1
ENTRY           JC1126 #type complete
TITLE           major allergen chain 1 precursor B - cat
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
                  31-Dec-1993
ACCESSIONS      JC1126
REFERENCE       JC1126
   #authors     Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
                  Morgenstern, J.P.; Rogers, B.L.
   #journal     Gene (1992) 113:263-268
   #title       Expression and genomic structure of the genes encoding FdI,
                  the major allergen from the domestic cat.
   #accession   JC1126
      ##molecule_type DNA
      ##residues 1-88 ##label GRI
GENETICS
   #gene        Ch1
   #introns     17/1; 79/3
FEATURE
   1-18         #domain signal sequence #status predicted #label SIG\
   19-88        #product major allergen chain 1 #status predicted #label
                  MAT
SUMMARY         #length 88 #molecular-weight 9586 #checksum 4095

DB 9; Score 178; Match 96.3%; Predicted No. 1.31e-19;
Matches 26; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Db   47 nalpvvlenarilkncvdakmteedke 73
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT     2
ENTRY           JC1136 #type complete
TITLE           major allergen chain 1 precursor A - cat
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
                  31-Dec-1993
REFERENCE       JC1126
   #authors     Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
                  Morgenstern, J.P.; Rogers, B.L.
   #journal     Gene (1992) 113:263-268
   #title       Expression and genomic structure of the genes encoding FdI,
                  the major allergen from the domestic cat.
   #accession   JC1136
      ##molecule_type DNA
      ##residues 1-92 ##label GRI
GENETICS
   #gene        Ch1
   #introns     21/1; 83/3
FEATURE
   1-22         #domain signal sequence #status predicted #label SIG\
   23-92        #product major allergen chain 1 #status predicted #label
                  MAT
SUMMARY         #length 92 #molecular-weight 10072 #checksum 4988

DB 9; Score 178; Match 96.3%; Predicted No. 1.31e-19;
Matches 26; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Db   51 nalpvvlenarilkncvdakmteedke 77
        :||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT     3
ENTRY           A53283 #type fragment
TITLE           major cat allergen Fel d I alpha chain - cat (fragment)
ORGANISM        #formal_name Felis silvestris catus #common_name domestic cat
DATE            12-May-1994 #sequence_revision 12-May-1994 #text_change
                  12-May-1994
ACCESSIONS      A53283
REFERENCE       A53283
   #authors     Duffort, O.A.; Carreira, J.; Nitti, G.; Polo, F.; Lombardero,
                  M.
   #journal     Mol. Immunol. (1991) 28:301-309
   #title       Studies on the biochemical structure of the major cat
                  allergen Felis domesticus I.
   #accession   A53283
      ##status preliminary
      ##molecule_type protein
      ##residues 1-40 ##label DUF
SUMMARY         #length 40 #checksum 3032

DB 9; Score 80; Match 100.0%; Predicted No. 4.90e-02;
Matches 12; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   29 kalpvvlenari 40
Qy    1 KALPVVLENARI 12



RESULT     4
ENTRY           GNNY2F #type complete
TITLE           genome polyprotein - foot-and-mouth disease virus A (strain
                  A(10)61)
CONTAINS        coat protein VP1; coat protein VP2; coat protein VP3; coat
                  protein VP4; core protein p52; genome-linked protein VPg1;
                  genome-linked protein VPg2; genome-linked protein VPg3;
                  nonstructural protein p20a; nonstructural protein p20b;
                  RNA-directed RNA polymerase (EC 2.7.7.48)
ORGANISM        #formal_name Aphthovirus A #common_name foot-and-mouth
                  disease virus A
DATE            17-Dec-1982 #sequence_revision 28-Aug-1985 #text_change
ACCESSIONS      A93508; A91491; S30753
REFERENCE       A93508
   #authors     Carroll, A.R.; Rowlands, D.J.; Clarke, B.E.
   #journal     Nucleic Acids Res. (1984) 12:2461-2472
   #title       The complete nucleotide sequence of the RNA coding for the
                  primary translation product of foot and mouth disease
                  virus.
   #cross-references MUID:84169547
   #accession   A93508
      ##molecule_type genomic RNA
      ##residues 1-2333 ##label CAR
      ##cross-references GB:X00429
REFERENCE       A91491
   #authors     Boothroyd, J.C.; Harris, T.J.R.; Rowlands, D.J.; Lowe, P.A.
   #journal     Gene (1982) 17:153-161
   #title       The nucleotide sequence of cDNA coding for the structural
                  proteins of foot-and-mouth disease virus.
   #cross-references MUID:82211814
   #accession   A91491
      ##molecule_type genomic RNA
      ##residues 115-395,'C',397-631, r, 633-1048 ##label BOO
      ##cross-references GB:V01130
REFERENCE       S30753
   #authors     Sangar, D.V.; Newton, S.E.; Rowlands, D.J.; Clarke, B.E.
   #journal     Nucleic Acids Res. (1987) 15:3305-3315
   #title       All foot and mouth disease virus serotypes initiate protein
                  synthesis at two separate AUGs.
   #accession   S30753
      ##molecule_type genomic RNA
      ##residues 1-32 ##label SAN
      ##cross-references EMBL:H31575
CLASSIFICATION  #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS        coat protein; core protein; genome-linked protein;
                  nonstructural protein; nucleotidyltransferase; polyprotein
FEATURE
   1-204        #product nonstructural protein p20a #label NPA\
   205-286      #product coat protein VP4 #label VP4\
   287-504      #product coat protein VP2 #label VP2\
   505-725      #product coat protein VP3 #label VP3\
   726-937      #product coat protein VP1 #label VP1\
   938-1578     #product core protein p52 #label CPP\
   1579-1601    #product genome-linked protein VPg1 #label GL1\
   1602-1625    #product genome-linked protein VPg2 #label GL2\
   1626-1649    #product genome-linked protein VPg3 #label GL3\
   1650-1863    #product nonstructural protein p20b #label NPB\
   1864-2333    #product RNA-directed RNA polymerase #label RRP
SUMMARY         #length 2333 #molecular-weight 259646 #checksum 7155

DB 3; Score 79; Match 59.1%; Predicted No. 6.98e-02;
Matches 13; Conservative 4; Mismatches 5; Indels 0; Gaps 0;

Db 1913 vvlddvifskhkgdakmteedk 1934

MM: l : I III II IN
Qy 5 VVLENARILKNCVDAKMTEEDK 26



RESULT     5
ENTRY           S37077 #type complete
TITLE           genome polyprotein - foot-and-mouth disease virus A (strain
                  A22/550 Azerbaijan 65)
CONTAINS        coat protein VP1; coat protein VP2; coat protein VP3; coat
                  protein VP4; core protein p14; core protein p19; core
                  protein p41; core protein X; genome-linked protein VPg1;
                  genome-linked protein VPg2; genome-linked protein VPg3;
                  nonstructural protein p20a; proteinase (EC 3.4.-.-);
ORGANISM        #formal_name Aphthovirus A #common_name foot-and-mouth
                  disease virus A
DATE            31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
                  31-Dec-1993
ACCESSIONS      S37077
REFERENCE       S37077
   #authors     Sosnovtsev, S.V.; Onischenko, A.M.; Petrov, N.A.;
                  Kalashnikova, T.I.; Mamaeva, N.V.; Drygin, V.Y.;
                  Perevozchikova, N.A.; Vasilenko, S.K.
   #submission  submitted to the EMBL Data Library, August 1993
   #accession   S37077
      ##molecule_type genomic RNA
      ##residues 1-2336 ##label SOS
      ##cross-references EMBL:X74812
CLASSIFICATION  #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS        coat protein; core protein; genome-linked protein;
                  nonstructural protein; nucleotidyltransferase; polyprotein
FEATURE
   1-217        #product nonstructural protein p20a #status predicted
                  #label NPA\
   213-286      #product coat protein VP4 #status predicted #label VP4\
   287-504      #product coat protein VP2 #status predicted #label VP2\
   505-724      #product coat protein VP3 #status predicted #label VP3\
   725-938      #product coat protein VP1 #status predicted #label VP1\
   939-954      #product core protein X #status predicted #label CPX\
   955-1108     #product core protein p14 #status predicted #label C14\
   1109-1426    #product core protein p41 #status predicted #label C41\
   1427-1579    #product core protein p19 #status predicted #label C19\
   1580-1602    #product genome-linked protein VPg1 #status predicted
                  #label VG1\
   1603-1626    #product genome-linked protein VPg2 #status predicted
                  #label VG2\
   1627-1650    #product genome-linked protein VPg3 #status predicted
                  #label VG3\
   1651-1863    #product proteinase #status predicted #label PTS\
   1864-2333    #product RNA-directed RNA polymerase #status predicted
                  #label RRP
SUMMARY         #length 2336 #molecular-weight 259983 #checksum 4399

DB 5; Score 75; Match 54.5%; Predicted No. 2.80e-01;
Matches 12; Conservative 5; Mismatches 5; Indels 0; Gaps 0;

Db 1913 vvldevifskhkgdtkmteedk 1934

Mi:: : |: IMMMM
Qy 5 VVLENARILKNCVDAKMTEEDK 26



RESULT 6
ENTRY           GNNYF #type complete
TITLE           genome polyprotein - foot-and-mouth disease virus O (strains
                O1K and O1BFS)
CONTAINS        coat protein VP1; coat protein VP2; coat protein VP3; coat
                protein VP4; core protein p12; core protein p14; core
                protein P20b; core protein p34; core protein P56; core
                protein VPg; nonstructural protein p20a
ORGANISM        #formal_name Aphthovirus O #common_name foot-and-mouth
                disease virus O
   #note        host Artiodactyla (cloven-footed mammals)
DATE            01-Sep-1981 #sequence_revision 27-Nov-1985 #text_change
                08-Apr-1994
ACCESSIONS      A03907; A37503
REFERENCE       A03907
   #authors     Forss, S.; Strebel, K.; Beck, E.; Schaller, H.
   #journal     Nucleic Acids Res. (1984) 12:6587-6601
   #title       Nucleotide sequence and genome organization of foot-and-mouth
   #cross-references MUID:84297249
   #contents    strain O1K
   #accession   A03907
      ##molecule_type mRNA
      ##residues 1-2332 ##label FOR
REFERENCE       A37503
   #authors     Makoff, A.J.; Paynter, C.A.; Rowlands, D.J.; Boothroyd, J.C.
   #journal     Nucleic Acids Res. (1982) 10:8285-8295
   #title       Comparison of the amino acid sequence of the major immunogen
                from three serotypes of foot and mouth disease virus.
   #cross-references MUID:83143292
   #contents    strain O1BFS
   #accession   A37503
«r:i^I:- tyPe ;:^ e 9.^.78l-807.'R'.809-860.'8'.862-9 S l HAK
COMMENT         The coat protein VP1 contains the main antigenic determinants of
                the virion; therefore, changes in its sequence must be
                responsible for the high antigenic variability of the virus.
COMMENT         Coat proteins VP2 and VP3 are related to the poliovirus coat
                proteins VP2 and VP3.
CLASSIFICATION  #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS        coat protein; core protein; nonstructural protein;
                polyprotein
FEATURE
   1-217        #product nonstructural protein p20a #label NPA\
   218-286      #product coat protein VP4 #label VP4\
   287-504      #product coat protein VP2 #label VP2\
   505-724      #product coat protein VP3 #label VP3\
   725-937      #product coat protein VP1 #label VP1\
   938-1107     #product core protein p12 #label C12\
   1108-1425    #product core protein p34 #label P34\
   1426-1578    #product core protein p14 #label C14\
   1579-1649    #product genome-linked protein VPg #label VPG\
   1650-1862    #product nonstructural protein p20b #label P20\
   1863-2332    #product RNA-directed RNA polymerase #label P56
SUMMARY         #length 2332 #molecular-weight 258925 #checksum 4170

DB 3; Score 72; Match 50.0%; Predicted No. 7.76e-01;
Matches 11; Conservative 6; Mismatches 5; Indels 0; Gaps 0;

Db 1912 vvldevifskhkgdtkmseedk 1933
        |||      |   | || ||||
Qy    5 VVLENARILKNCVDAKMTEEDK 26



RESULT 7
ENTRY           S38598 #type fragment
TITLE           hypothetical protein 136 (rpl20 5' region) - euglenid
                (Astasia longa) plastid (fragment)
ORGANISM        #formal_name plastid Astasia longa
DATE            31-Dec-1993 #sequence_revision 02-Aug-1994 #text_change
                02-Aug-1994
ACCESSIONS      S38598
REFERENCE       S38590
   #authors     Gockel, G.; Baier, S.; Hachtel, W.
   #submission  submitted to the EMBL Data Library, November 1993
   #accession   S38598
      ##molecule_type DNA
      ##residues 1-136 ##label GOC
      ##cross-references EMBL:X75653
KEYWORDS        plastid
SUMMARY         #length 136 #checksum 6797

DB 7; Score 71; Match 46.7%; Predicted No. 1.09e+00;
Matches 7; Conservative 5; Mismatches 3; Indels 0; Gaps 0;

Db  120 ldddrilnvcvitrm 134
        |   |||  ||   |
Qy    7 LENARILKNCVDAKM 21



RESULT 8
ENTRY           S28285 #type complete
TITLE           hypothetical protein C38C10.1 - Caenorhabditis elegans
ORGANISM        #formal_name Caenorhabditis elegans
DATE            12-Mar-1993 #sequence_revision 12-Mar-1993 #text_change
                30-Sep-1993
ACCESSIONS      S28285
REFERENCE       S28285
   #authors     Thomas, K.
   #submission  submitted to the EMBL Data Library, December 1992
   #accession   S28285
      ##molecule_type DNA
      ##residues 1-374 ##label THO
      ##cross-references EMBL:Z19153
   #introns     6/2; 108/2; 149/3; 176/2; 225/3; 289/2; 349/1
SUMMARY         #length 374 #molecular-weight 42940 #checksum 2438

DB 8; Score 71; Match 22.2%; Predicted No. 1.09e+00;
Matches 6; Conservative 14; Mismatches 7; Indels 0; Gaps 0;

Db  322 rsmaislqkgrvnsscldkkvkenssq 348
              |   |    | | |  |
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 9
ENTRY           S37691 #type complete
TITLE           rna1p protein - fission yeast (Schizosaccharomyces pombe)
ORGANISM        #formal_name Schizosaccharomyces pombe
DATE            18-May-1994 #sequence_revision 18-May-1994 #text_change
                18-May-1994
ACCESSIONS      S37691
REFERENCE       S37691
   #authors     Melchior, F.; Weber, K.; Gerke, V.
   #journal     Mol. Biol. Cell (1993) 4:569-581
   #title       A functional homologue of the RNA1 gene product in
                Schizosaccharomyces pombe: purification, biochemical
                characterization, and identification of a leucine-rich
                repeat motif.
   #accession   S37691
      ##status preliminary
      ##residues 1-386 ##label MEL
      ##cross-references EMBL:X69882
SUMMARY         #length 386 #molecular-weight 43235 #checksum 8326

DB 11; Score 70; Match 36.8%; Predicted No. 1.51e+00;
Matches 7; Conservative 4; Mismatches 8; Indels 0; Gaps 0;

Db  286 ieldavrtlktvidekmpd 304
          |   | ||   | ||
Qy    5 VVLENARILKNCVDAKMTE 23



RESULT 10
ENTRY           S02068 #type complete
TITLE           RNA-directed RNA polymerase (EC 2.7.7.48) - foot-and-mouth
                disease virus A
ALTERNATE_NAMES RNA replicase
ORGANISM        #formal_name Aphthovirus A #common_name foot-and-mouth
                disease virus A
DATE            01-Dec-1989 #sequence_revision 01-Dec-1989 #text_change
                30-Sep-1993
ACCESSIONS      S02068
REFERENCE       S02068
   #authors     Villaverde, A.; Martinez-Salas, E.; Domingo, E.
   #journal     J. Mol. Biol. (1988) 204:771-776
   #title       3D gene of foot-and-mouth disease virus. Conservation by
                convergence of average sequences.
   #cross-references MUID:89141768
   #accession   S02068
      ##molecule_type mRNA
      ##residues 1-470 ##label VIL
      ##note 48-Gly, 68-Ala, 158-Val, 274-Ile, 306-Ile, 374-Leu, and
                444-Glu were also found
      ##note sequence not compared to nucleotide translation
GENETICS
   #gene        3D
CLASSIFICATION  #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS        nucleotidyltransferase
SUMMARY         #length 470 #molecular-weight 52910 #checksum 502

DB 5; Score 69; Match 45.5%; Predicted No. 2.10e+00;
Matches 10; Conservative 7; Mismatches 5; Indels 0; Gaps 0;

Db   50 vvldevifsrhkgdtkmseedk 71
        |||          | || ||||
Qy    5 VVLENARILKNCVDAKMTEEDK 26



RESULT 11
ENTRY           JN0431 #type complete
TITLE           RNA-directed RNA polymerase (EC 2.7.7.48) - foot-and-mouth
                disease virus A (strain A22)
ORGANISM        #formal_name Aphthovirus A #common_name foot-and-mouth
                disease virus A
DATE            05-Mar-1993 #sequence_revision 05-Mar-1993 #text_change
                30-Sep-1993
ACCESSIONS      JN0431
REFERENCE       JN0431
   #authors     Kuz'min, I.V.; Rybakov, S.S.; Ivanyushchenkov, V.N.; Burdov,
                A.N.
   #journal     Bioorg. Khim. (1989) 15:419-422
   #title       Nucleotide sequence of the FMDV A22 RNA polymerase gene.
   #cross-references MUID:89302183
   #accession   JN0431
      ##molecule_type mRNA
      ##residues 1-470 ##label KUZ
      ##note this paper is in Russian, with an English abstract
CLASSIFICATION  #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS        nucleotidyltransferase
SUMMARY         #length 470 #molecular-weight 52657 #checksum 1182

DB 5; Score 69; Match 50.0%; Predicted No. 2.10e+00;
Matches 11; Conservative 5; Mismatches 6; Indels 0; Gaps 0;

Db   50 vvldevifskhkgdtkmtaedk 71
        |||      |   | ||| |||
Qy    5 VVLENARILKNCVDAKMTEEDK 26



RESULT 12
ENTRY           DEPGLH #type complete
TITLE           L-lactate dehydrogenase (EC 1.1.1.27) chain H - pig
ORGANISM        #formal_name Sus scrofa domestica #common_name domestic pig
DATE            #sequence_revision 07-May-1981 #text_change 05-Aug-1994

   #journal     Hoppe-Seyler's Z. Physiol. Chem. (1977) 358:123-127
   #title       The primary structure of porcine lactate dehydrogenase:
                isoenzymes M-4 and H-4.
   #cross-references MUID:77117453
   #accession   A91671
      ##label KIL
REFERENCE       A94603
   #authors     Kiltz, H.H.
   #submission  submitted to the Atlas, October 1977
   #accession   A94603
      ##molecule_type protein
      ##residues 1-333 ##label KI2
REFERENCE       A92870
   #authors     Grau, U.M.; Trommer, W.E.; Rossmann, M.G.
   #journal     J. Mol. Biol. (1981) 151:289-307
   #title       Structure of the active ternary complex of pig heart lactate
                dehydrogenase with S-lac-NAD at 2.7 angstrom resolution.
   #cross-references MUID:82170431
   #contents    annotation; X-ray crystallography, 2.7 angstrom
   #note        the structure of a complex with a coenzyme-substrate analog
                was solved
COMMENT         A tetramer of H chains is the predominant form of the enzyme in
                heart muscle.
CLASSIFICATION  #superfamily L-lactate dehydrogenase
KEYWORDS        acetylated amino end; NAD; oxidoreductase; tetramer
FEATURE
   1            #modified_site acetylated amino end (Ala) #status
                experimental\
   1A ,         #active_site Cys #status experimental
SUMMARY         #length 333 #molecular-weight 36476 #checksum 6356

DB 1; Score 68; Match 37.5%; Predicted No. 2.91e+00;
Matches 9; Conservative 8; Mismatches 6; Indels 1; Gaps 1;

Db  290 slpcvl-nargltsvinqklkdde 312
         || || ||| |      |
Qy    2 ALPVVLENARILKNCVDAKMTEED 25



RESULT 13
ENTRY           B32957 #type fragment
TITLE           L-lactate dehydrogenase (EC 1.1.1.27) chain H - rabbit
                (fragment)
ORGANISM        #formal_name Oryctolagus cuniculus #common_name domestic
                rabbit
DATE            22-Nov-1989 #sequence_revision 22-Nov-1989 #text_change
                02-Aug-1994
ACCESSIONS      B32957
REFERENCE
   #authors     Sass, C.; Briand, M.; Benslimane, S.; Renaud, M.; Briand, Y.
   #journal     J. Biol. Chem. (1989) 264:4076-4081
   #title       Characterization of rabbit lactate dehydrogenase-M and
                lactate dehydrogenase-H cDNAs. Control of lactate
                dehydrogenase expression in rabbit muscle.
   #cross-references MUID:89139477
   #accession   B32957
      ##status preliminary
      ##molecule_type mRNA
      ##residues 1-217 ##label SAS
      ##cross-references GB:M22584; GB:J04595
CLASSIFICATION  #superfamily L-lactate dehydrogenase
SUMMARY         #length 217 #checksum 4425

DB 4; Score 66; Match 33.3%; Predicted No. 5.55e+00;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db  174 slpcil-nargltsvinqklkdde 196
         ||  | ||| |      |
Qy    2 ALPVVLENARILKNCVDAKMTEED 25



RESULT 14
ENTRY           S02795 #type complete
TITLE           L-lactate dehydrogenase (EC 1.1.1.27) B - human
ORGANISM        #formal_name Homo sapiens #common_name man
DATE            01-Dec-1989 #sequence_revision 01-Dec-1989 #text_change
                02-Aug-1994
ACCESSIONS      S02795; S06281
REFERENCE       S02795
   #authors     Takeno, T.; Li, S.S.L.
   #journal     Biochem. J. (1989) 257:921-924
   #title       Structure of the human lactate dehydrogenase B gene.
   #cross-references MUID:89193506
   #accession   S02795
      ##molecule_type DNA
      ##residues 1-334 ##label TAK
      ##cross-references EMBL:X13794
REFERENCE       S06281
   #authors     Sakai, I.; Sharief, F.S.; Pan, Y.C.E.; Li, S.S.L.
   #journal     Biochem. J. (1987) 248:933-936
   #title       The cDNA and protein sequences of human lactate dehydrogenase
                B.
   #cross-references MUID:88133965
   #accession   S06281
      ##molecule_type mRNA
      ##residues 1-334 ##label SAK
      ##note part of this sequence was confirmed by protein
                sequencing
GENETICS
   #gene        GDB:LDHB
   #map_position 12p12.2-p12.1
   #introns     43/3; 83/1; 141/1; 199/1; 238/2; 279/3
CLASSIFICATION  #superfamily L-lactate dehydrogenase
KEYWORDS        NAD; oxidoreductase
FEATURE
   2-334        #product L-lactate dehydrogenase B #status predicted
                #label MAT
SUMMARY         #length 334 #molecular-weight 36638 #checksum 6440

DB 4; Score 66; Match 33.3%; Predicted No. 5.55e+00;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db  291 slpcil-nargltsvinqklkdde 313
         ||  | ||| |      |
Qy    2 ALPVVLENARILKNCVDAKMTEED 25



RESULT 15
ENTRY           S09954 #type complete
TITLE           L-lactate dehydrogenase (EC 1.1.1.27) B - mouse
ORGANISM        #formal_name Mus musculus #common_name house mouse
DATE            12-Feb-1993 #sequence_revision 12-Feb-1993 #text_change
                02-Aug-1994
ACCESSIONS      S09954
REFERENCE       S09954
   #authors     Hiraoka, B.Y.; Sharief, F.S.; Yang, Y.W.; Li, W.H.; Li,
   #journal     Eur. J. Biochem. (1990) 189:215-220
   #title       The cDNA and protein sequences of mouse lactate dehydrogenase
                B. Molecular evolution of vertebrate lactate dehydrogenase
                genes A (muscle), B (heart) and C (testis).
   #cross-references MUID:90249362
   #accession   S09954
      ##molecule_type mRNA
      ##residues 1-334 ##label HIR
      ##cross-references EMBL:X51905
      ##note the authors translated the codon CTG for residue 41 as
                Lys and AAT for residue 306 as Asp
CLASSIFICATION  #superfamily L-lactate dehydrogenase
KEYWORDS        NAD; oxidoreductase
SUMMARY         #length 334 #molecular-weight 36572 #checksum 6533

DB 4; Score 66; Match 33.3%; Predicted No. 5.55e+00;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db  291 slpcil-nargltsvinqklkdde 313
         ||  | ||| |      |
Qy    2 ALPVVLENARILKNCVDAKMTEED 25

Search completed: Fri Mar 24 07:43:41 1995 
Job time : 21 secs.

Release 2.0 John F. Collins & S. S. Sturrock, Biocomputing Research Unit.
Copyright (c) 1993, 1994 by University of Edinburgh, U.K.
Distribution rights by IntelliGenetics, Inc.



MPsrch_pp protein - protein database search, using Smith-Waterman algorithm

Run on:         Fri Mar 24 07:42:48 1995; MasPar time 3.42 Seconds
                                          111.824 Million cell updates/sec

Tabular output not generated.
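
The run header names the Smith-Waterman algorithm. As a rough illustration only, the following Python sketch computes a best local alignment score. It is not the MPsrch implementation: the function name and the simple match/mismatch/linear-gap scoring are assumptions made for the example, whereas this run used a PAM 150 table with a gap penalty of 14. Each pass through the inner loop fills one matrix cell, which is the unit counted by the "cell updates/sec" figure.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score (linear gap penalty).

    The scoring values are illustrative assumptions, not PAM 150.
    """
    prev = [0] * (len(target) + 1)   # previous matrix row
    best = 0
    for q in query:
        curr = [0]                   # column 0 is always zero
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap       # gap in the target
            left = curr[j - 1] + gap # gap in the query
            cell = max(0, diag, up, left)  # local alignment: never below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

Only two rows of the matrix are kept because the sketch reports the score alone; recovering the alignment itself would need the full matrix or a traceback structure.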



Title:          US-08-300-510-2
Description:    (1-27) from US08300510.pep
Perfect Score:  184
Sequence:       1 KALPVVLENARILKNCVDAKMTEEDKE 27

Scoring table:  PAM 150
                Gap 14

Searched:       40292 seqs, 14147368 residues

Database:       swiss-prot30
                1 SPT1
                2 SPT2
                3 SPT3
                4 SPT4
                5 SPT5
                6 SPT6
                7 SPT7
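
The "cell updates/sec" figure in the run header can be checked approximately from the values above. The sketch below assumes that one cell update corresponds to one query residue compared against one database residue; that definition is an assumption, not something this report states, and the function name is hypothetical.

```python
def cell_updates_per_second(query_length, db_residues, seconds):
    """Estimated matrix cells filled per second for one search."""
    return query_length * db_residues / seconds


if __name__ == "__main__":
    # Values from this run: 27-residue query, 14147368 residues, 3.42 s.
    rate = cell_updates_per_second(27, 14147368, 3.42)
    print(f"{rate / 1e6:.1f} million cell updates/sec")  # about 111.7
```

The estimate lands close to the reported 111.824 million; the small difference is consistent with the printed MasPar time being rounded to two decimals.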



Statistics:     Mean 30.782; Variance 44.012; scale 0.699

Predicted No. is the number of results expected by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES

                    %
Result            Query
   No.   Score    Match  Length  DB  ID          Description            Pred. No.

     1     184    100.0      88   2  FELB_FELCA  MAJOR ALLERGEN I POLY  1.03e-27
     2     184    100.0      92   2  FELA_FELCA  MAJOR ALLERGEN I POLY  1.03e-27
     3      79     42.9    2333   5  POLG_FMDV1  GENOME POLYPROTEIN (N  3.09e-03
     4      72     39.1    2332   5  POLG_FMDVO  GENOME POLYPROTEIN (N  6.49e-02
     5      71     38.6     374   7  YLD1_CAEEL  PROBABLE G PROTEIN-CO  9.89e-02
     6      71     38.6     136   7  YCTP_ASTLO  HYPOTHETICAL PROTEIN   9.89e-02
     7      68     37.0     333   4  LDHH_PIG    L-LACTATE DEHYDROGENA  3.42e-01
     8      66     35.9     333   4  LDHH_HUMAN  L-LACTATE DEHYDROGENA  7.66e-01
     9      66     35.9    2332   5  POLG_FMDVA  GENOME POLYPROTEIN (N  7.66e-01
    10      66     35.9     217   4  LDHH_RABIT  L-LACTATE DEHYDROGENA  7.66e-01
    11      66     35.9     333   4  LDHH_MOUSE  L-LACTATE DEHYDROGENA  7.66e-01
    12      66     35.9    1829   4  MYSD_CHICK  DILUTE MYOSIN HEAVY C  7.66e-01
    13      64     34.8     215   1  ACRR_ECOLI  POTENTIAL REPRESSOR F  1.69e+00
    14      63     34.2     906   3  HDE_CANTR   HYDRATASE-DEHYDROGENA  2.49e+00
    15      63     34.2     861   5  POLG_FMDVS  GENOME POLYPROTEIN (C  2.49e+00
    16      62     33.7     419   4  K1C4_XENLA  KERATIN, TYPE I CYTOS  3.65e+00
    17      62     33.7     412   4  K1M1_SHEEP  KERATIN, TYPE I MICRO  3.65e+00
    18      62     33.7     403   4  K1M2_SHEEP  KERATIN, TYPE I MICRO  3.65e+00
    19      62     33.7     247   6  SUMT_PSEFL  UROPORPHYRIN-III C-ME  3.65e+00
    20      61     33.2      90   3  IF1_CHLTR   INITIATION FACTOR IF-  5.33e+00
    21      61     33.2     873   4  KFPS_FUJSV  TYROSINE-PROTEIN KINA  5.33e+00
    22      59     32.1     400   4  K1CS_HUMAN  KERATIN, TYPE I CYTOS  1.12e+01
    23      59     32.1     698   2  CVAB_ECOLI  COLICIN V SECRETION A  1.12e+01
    24      59     32.1     767   6  TOP1_MOUSE  DNA TOPOISOMERASE I (  1.12e+01
    25      59     32.1    1178   5  PYC_MOUSE   PYRUVATE CARBOXYLASE   1.12e+01
    26      59     32.1    1177   5  PH81_YEAST  PHOSPHATE SYSTEM POSI  1.12e+01
    27      59     32.1     765   6  TOP1_HUMAN  DNA TOPOISOMERASE I (  1.12e+01
    28      59     32.1     631   4  MX1_MOUSE   INTERFERON-INDUCED GT  1.12e+01
    29      59     32.1     900   3  FOX2_YEAST  PEROXISOMAL HYDRATASE  1.12e+01
    30      59     32.1     767   6  TOP1_CRIGR  DNA TOPOISOMERASE I (  1.12e+01
    31      58     31.5     846   7  VAV_HUMAN   VAV ONCOGENE.          1.61e+01
    32      58     31.5     333   7  XYNB_STRLI  ENDO-1,4-BETA-XYLANAS  1.61e+01
    33      58     31.5     423   5  PSY_ARATH   PHYTOENE SYNTHASE PRE  1.61e+01
    34      58     31.5     550   5  PHNL_DESGI  PERIPLASMIC [NIFE] HY  1.61e+01
    35      58     31.5     249   1  BA71_EUBSP  7-ALPHA-HYDROXYSTEROI  1.61e+01
    36      58     31.5     986   2  EPIB_STAEP  EPIDERMIN BIOSYNTHESI  1.61e+01
    37      58     31.5     845   7  VAV_MOUSE   VAV PROTO-ONCOGENE.    1.61e+01
    38      58     31.5     431   4  K1CX_HUMAN  KERATIN, TYPE I CYTOS  1.61e+01
    39      57     31.0    2150   6  SDC3_CAEEL  SDC-3 PROTEIN.         2.30e+01
    40      57     31.0    1131   7  YA19_YEAST  HYPOTHETICAL 128.5 KD  2.30e+01
    41      57     31.0     906   2  CTNA_HUMAN  ALPHA-CATENIN (CADHER  2.30e+01
    42      57     31.0     698   2  CRAC_DICDI  PROTEIN CRAC.          2.30e+01
    43      57     31.0     906   2  CTNA_MOUSE  ALPHA-CATENIN (102 KD  2.30e+01
    44      57     31.0     116   6  RL21_MARPO  50S RIBOSOMAL PROTEIN  2.30e+01
    45      57     31.0     192   3  HS41_SOYBN  22.0 KD CLASS IV HEAT  2.30e+01
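
The % Query Match column in the SUMMARIES table appears to be the result score expressed as a percentage of the Perfect Score (184 for this query). The Python sketch below reproduces the printed values under that assumption; the function name is hypothetical, and the formula is inferred from the numbers in this report rather than taken from MPsrch documentation.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    perfect = 184
    for score in (184, 79, 72, 71, 68, 66, 57):
        print(score, query_match_percent(score, perfect))
```

Because this figure depends only on the score, two results with very different alignments (for example results 5 and 6, both scoring 71) show the same % Query Match.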



ALIGNMENTS 



RESULT 1
ID   FELB_FELCA     STANDARD;      PRT;    88 AA.
AC   P30439;
DT   01-APR-1993 (REL. 25, CREATED)
DT   01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DE   MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PRECURSOR (FEL D I)
DE   (CAT-1) (AG 4).
GN   CH1.
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; CARNIVORA.
RN   [1]
RP   SEQUENCE FROM N.A., AND SEQUENCE OF 19-88.
RA   MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.M., ROGERS B.L.,
RA   BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN   [2]
RP   SEQUENCE FROM N.A.
RA   GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA   ROGERS B.L.;
RL   GENE 113:263-268(1992).
RN   [3]
RP   SEQUENCE OF 19-58, AND CHARACTERIZATION.
RA   DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL   MOL. IMMUNOL. 28:301-309(1991).
RN   [4]
RP   CHARACTERIZATION.
RA   LEITERMANN K., OHMAN J.L. JR.;
RL   J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).
CC   -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC   -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC       DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC   -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC   -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC       RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC       OF THIS ALLERGEN SUBUNIT.
CC   -!- SIMILARITY: TO UTEROGLOBIN.
DR   EMBL; M74953; FDFELDIB.
DR   PIR; JC1126; JC1126.
DR   PROSITE; PS00403; UTEROGLOBIN_1.
DR   PROSITE; PS00404; UTEROGLOBIN_2.
KW   ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT   SIGNAL        1     18
FT   CHAIN        19     88       MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT   DISULFID     21     21       INTERCHAIN (POTENTIAL).
FT   DISULFID     88     88       INTERCHAIN (POTENTIAL).
FT   VARIANT      47     47       K -> N.
FT   CONFLICT     78     78       L -> V (IN REF. 2).
SQ   SEQUENCE   88 AA;  9614 MW;  39445 CN;

DB 2; Score 184; Match 100.0%; Predicted No. 1.03e-27;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   47 kalpvvlenarilkncvdakmteedke 73
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 2
ID   FELA_FELCA     STANDARD;      PRT;    92 AA.
AC   P30438;
DT   01-APR-1993 (REL. 25, CREATED)
DT   01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT   01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE   MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PRECURSOR (FEL D I)
DE   (CAT-1) (AG 4).
GN   CH1.
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; CARNIVORA.
RN   [1]
RP   SEQUENCE FROM N.A., AND SEQUENCE OF 23-92.
RC   TISSUE=SALIVARY GLAND;
RA   MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.M., ROGERS B.L.,
RA   BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN   [2]
RP   SEQUENCE FROM N.A.
RA   GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA   ROGERS B.L.;
RL   GENE 113:263-268(1992).
RN   [3]
RP   SEQUENCE OF 23-62, AND CHARACTERIZATION.
RA   DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL   MOL. IMMUNOL. 28:301-309(1991).
RN   [4]
RP   CHARACTERIZATION.
RA   LEITERMANN K., OHMAN J.L. JR.;
RL   J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).
CC   -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC   -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC       DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC   -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC   -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC       RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC       OF THIS ALLERGEN SUBUNIT.
CC   -!- SIMILARITY: TO UTEROGLOBIN.
DR   EMBL; M74952; FDFELDI.
DR   PIR; JC1136; JC1136.
DR   PROSITE; PS00403; UTEROGLOBIN_1.
DR   PROSITE; PS00404; UTEROGLOBIN_2.
KW   ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT   CHAIN        23     92       MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT   DISULFID     25     25       INTERCHAIN (POTENTIAL).
FT   DISULFID     92     92       INTERCHAIN (POTENTIAL).
FT   VARIANT      51     51       K -> N.
FT   CONFLICT      5      5       R -> C (IN REF. 2).
FT   CONFLICT     18     18       M -> S (IN REF. 2).
FT   CONFLICT     82     82       L -> V (IN REF. 2).
SQ   SEQUENCE   92 AA;  10252 MW;  43206 CN;

DB 2; Score 184; Match 100.0%; Predicted No. 1.03e-27;
Matches 27; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Db   51 kalpvvlenarilkncvdakmteedke 77
        |||||||||||||||||||||||||||
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 3
ID   POLG_FMDV1     STANDARD;      PRT;  2333 AA.
AC   P03306;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE   VP4; CORE PROTEIN P52; GENOME-LINKED PROTEINS VPG1 TO VPG3; PICORNAIN
DE   3C (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED RNA POLYMERASE
DE   (EC 2.7.7.48)).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   84169547
RA   CARROLL A.R., ROWLANDS D.J., CLARKE B.E.;
RL   NUCLEIC ACIDS RES. 12:2461-2472(1984).
RN   [2]
RP   SEQUENCE OF 115-1048 FROM N.A.
RM   82211814
RA   BOOTHROYD J.C., HARRIS T.J.R., ROWLANDS D.J., LOWE P.A.;
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
DR   EMBL; X00429; PIFMDV1.
KW   POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW   HYDROLASE; THIOL PROTEASE; MYRISTYLATION.
FT   CHAIN         1    201       NONSTRUCTURAL PROTEIN P20A.
FT   CHAIN       202    286       COAT PROTEIN VP4.
FT   CHAIN       287    504       COAT PROTEIN VP2.
FT   CHAIN       505    725       COAT PROTEIN VP3.
FT   CHAIN       726    937       COAT PROTEIN VP1.
FT   CHAIN       938   1578       CORE PROTEIN P52.
FT   CHAIN      1579   1601       GENOME-LINKED PROTEIN VPG1.
FT   CHAIN      1602   1625       GENOME-LINKED PROTEIN VPG2.
FT   CHAIN      1626   1649       GENOME-LINKED PROTEIN VPG3.
FT   CHAIN      1650   1863       PROTEASE P20B.
FT   CHAIN      1864   2333       RNA-DIRECTED RNA POLYMERASE P56A.
FT   LIPID       202    202       MYRISTATE.
FT   CONFLICT    396    396       S -> C (IN REF. 2).
FT   CONFLICT    632    632       P -> L (IN REF. 2).
SQ   SEQUENCE   2333 AA;  259645 MW;  19388774 CN;

DB 5; Score 79; Match 59.1%; Predicted No. 3.09e-03;
Matches 13; Conservative 4; Mismatches 5; Indels 0; Gaps 0;

Db 1913 vvlddvifskhkgdakmteedk 1934
        |||      |   |||||||||
Qy    5 VVLENARILKNCVDAKMTEEDK 26
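
The Match percentage printed with each alignment differs from the % Query Match in the SUMMARIES table. The figures in this report are consistent with Match being the number of identical positions divided by the number of aligned columns (matches + conservative + mismatches + indels). The Python sketch below checks that reading; the function name is hypothetical and the denominator is an inference from the printed counts, not a documented MPsrch definition.

```python
def alignment_match_percent(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of aligned columns."""
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # Result 3 above: Matches 13; Conservative 4; Mismatches 5; Indels 0
    print(alignment_match_percent(13, 4, 5, 0))
    # Result 5 (YLD1_CAEEL): Matches 6; Conservative 14; Mismatches 7
    print(alignment_match_percent(6, 14, 7, 0))
```

Under this reading a short, highly identical alignment can show a high Match value while its % Query Match stays low, which is why the two columns should not be compared directly.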



RESULT 4
ID   POLG_FMDVO     STANDARD;      PRT;  2332 AA.
AC   P03305;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE   VP4; CORE PROTEINS P12, P34, P14; GENOME-LINKED PROTEIN VPG; PROTEASE
DE   (EC 3.4.22.-); RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48)).
OS   FOOT-AND-MOUTH DISEASE VIRUS (STRAINS O1K AND O1BFS) (APHTHOVIRUS O).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=O1K;
RM   84297249
RA   FORSS S., STREBEL K., BECK E., SCHALLER H.;
RL   NUCLEIC ACIDS RES. 12:6587-6601(1984).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=O1BFS;
RA   MAKOFF A.J., PAYNTER C.A., ROWLANDS D.J., BOOTHROYD J.C.;
RL   NUCLEIC ACIDS RES. 10:8285-8295(1982).
RN   [3]
RP   X-RAY CRYSTALLOGRAPHY (2.9 ANGSTROMS).
RM   89143740
RA   ACHARYA R., FRY E., STUART D., FOX G., ROWLANDS D., BROWN F.;
RL   NATURE 337:709-716(1989).
CC   -!- THE STRAIN O1K SEQUENCE IS SHOWN.
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- THE COAT PROTEIN VP1 CONTAINS THE MAIN ANTIGENIC DETERMINANTS OF
CC       THE VIRION; THEREFORE, CHANGES IN ITS SEQUENCE MUST BE
CC       RESPONSIBLE FOR THE HIGH ANTIGENIC VARIABILITY OF THE VIRUS.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
DR   EMBL; X00871; PIFMDV2.
DR   EMBL; J02185; PI01VP.
DR   PIR; A03907; GNNYF.
KW   POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW   HYDROLASE; THIOL PROTEASE; MYRISTYLATION.
FT   CHAIN         1    201       NONSTRUCTURAL PROTEIN P20A.
FT   CHAIN       202    286       COAT PROTEIN VP4.
FT   CHAIN       287    504       COAT PROTEIN VP2.
FT   CHAIN       505    724       COAT PROTEIN VP3.
FT   CHAIN       725    937       COAT PROTEIN VP1.
FT   CHAIN       938   1107       CORE PROTEIN P12.
FT   CHAIN      1108   1425       CORE PROTEIN P34.
FT   CHAIN      1426   1578       CORE PROTEIN P14.
FT   CHAIN      1579   1649       GENOME-LINKED PROTEIN VPG.
FT   CHAIN      1650   1862       PROTEASE.
FT   CHAIN      1863   2332       RNA-DIRECTED RNA POLYMERASE.
FT   LIPID       202    202       MYRISTATE.
FT   DISULFID    511    511       INTERCHAIN (IN VP3 DIMER).
FT   DISULFID    406    858       IN VP2-VP1 DIMER.
FT   VARIANT     780    780       I -> V (IN STRAIN O1BFS).
FT   VARIANT     808    808       G -> R (IN STRAIN O1BFS).
FT   VARIANT     861    861       N -> S (IN STRAIN O1BFS).
SQ   SEQUENCE   2332 AA;  258924 MW;  19411374 CN;

DB 5; Score 72; Match 50.0%; Predicted No. 6.49e-02;
Matches 11; Conservative 6; Mismatches 5; Indels 0; Gaps 0;

Db 1912 vvldevifskhkgdtkmseedk 1933
        |||      |   | || ||||
Qy    5 VVLENARILKNCVDAKMTEEDK 26



RESULT 5
ID   YLD1_CAEEL     STANDARD;      PRT;   374 AA.
AC   Q03566;
DT   01-FEB-1994 (REL. 28, CREATED)
DT   01-FEB-1994 (REL. 28, LAST SEQUENCE UPDATE)
DT   01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE   PROBABLE G PROTEIN-COUPLED RECEPTOR C38C10.1 IN CHROMOSOME III.
GN   C38C10.1.
OC   EUKARYOTA; METAZOA; ACOELOMATES; NEMATODA; SECERNENTEA; RHABDITIDA.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   WILSON R., AINSCOUGH R., ANDERSON K., BAYNES C., BERKS M.,
RA   BONFIELD J., BURTON J., CONNELL M., COPSEY T., COOPER J., COULSON A.,
RA   CRAXTON M., DEAR S., DU Z., DURBIN R., FAVELLO A., FRASER A.,
RA   FULTON L., GARDNER A., GREEN P., HAWKINS T., HILLIER L., JIER M.,
RA   JOHNSTON L., JONES M., KERSHAW J., KIRSTEN J., LAISSTER N.,
RA   LATREILLE P., LIGHTNING J., LLOYD C., MORTIMORE B., O'CALLAGHAN M.,
RA   PARSONS J., PERCY C., RIFKEN L., ROOPRA A., SAUNDERS D., SHOWNKEEN R.,
RA   SIMS M., SMALDON N., SMITH A., SMITH M., SONNHAMMER E., STADEN R.,
RA   SULSTON J., THIERRY-MIEG J., THOMAS K., VAUDIN M., VAUGHAN K.,
RA   WATERSTON R., WATSON A., WEINSTOCK L., WILKINSON-SPROAT J.,
RA   WOHLDMAN P.;
RL   NATURE 368:32-38(1994).
CC   -!- FUNCTION: NOT KNOWN. PUTATIVE RECEPTOR.
CC   -!- SUBCELLULAR LOCATION: INTEGRAL MEMBRANE PROTEIN (POTENTIAL).
CC   -!- SIMILARITY: BELONGS TO FAMILY 1 OF G-PROTEIN COUPLED RECEPTORS.
CC       MOST SIMILAR TO TACHYKININS RECEPTORS.
DR   EMBL; Z19153; CEC38C10.
DR   PIR; S28285; S28285.
DR   WORMPEP; C38C10.1; CE00104.
DR   GCRDB; GCR_0567; -.
DR   PROSITE; PS00237; G_PROTEIN_RECEPTOR.
KW   HYPOTHETICAL PROTEIN; G-PROTEIN COUPLED RECEPTOR; TRANSMEMBRANE;
KW   GLYCOPROTEIN.
SQ   SEQUENCE   374 AA;  42940 MW;  769122 CN;

DB 7; Score 71; Match 22.2%; Predicted No. 9.89e-02;
Matches 6; Conservative 14; Mismatches 7; Indels 0; Gaps 0;

Db  322 rsmaislqkgrvnsscldkkvkenssq 348
              |   |    | | |  |
Qy    1 KALPVVLENARILKNCVDAKMTEEDKE 27



RESULT 6
ID   YCTP_ASTLO     STANDARD;      PRT;   136 AA.
AC   P34776;
DT   01-FEB-1994 (REL. 28, CREATED)
DT   01-FEB-1994 (REL. 28, LAST SEQUENCE UPDATE)
DT   01-FEB-1994 (REL. 28, LAST ANNOTATION UPDATE)
DE   HYPOTHETICAL PROTEIN IN TRNP 5' REGION (FRAGMENT).
OS   ASTASIA LONGA (EUGLENOPHYCEAN ALGA).
OG   CHLOROPLAST.
OC   EUKARYOTA; PLANTA; PHYCOPHYTA; EUGLENOPHYTA.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=CCAP 1204-17A;
RA   GOCKEL G., BAIER S., HACHTEL W.;
RL   SUBMITTED (NOV-1993) TO EMBL/GENBANK/DDBJ DATA BANKS.
DR   EMBL; X75653; ALRIBPTR.
DR   PIR; S38598; S38598.
KW   CHLOROPLAST; HYPOTHETICAL PROTEIN.
FT   NON_TER       1      1
SQ   SEQUENCE   136 AA;  16587 MW;  103277 CN;

DB 7; Score 71; Match 46.7%; Predicted No. 9.89e-02;
Matches 7; Conservative 5; Mismatches 3; Indels 0; Gaps 0;

Db  120 ldddrilnvcvitrm 134
        |   |||  ||   |
Qy    7 LENARILKNCVDAKM 21



RESULT 7 

ID LDHH_PIG STANDARD; PRT; 333 AA. 

AC P00336; 

DT 21-JUL-1986 (REL. 01, CREATED) 

DT 21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE) 

DT 01-MAR-1992 (REL. 21, LAST ANNOTATION UPDATE) 

DE L-LACTATE DEHYDROGENASE H CHAIN (EC 1.1.1.27) (LDH-B) . 

OS SUS SCROFA (PIG).

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;

OC EUTHERIA; ARTIODACTYLA. 



RN [1]
RP SEQUENCE.
RM 77117453
RA KILTZ H.-H., KEIL W., GRIESBACH ., PETRY K., MEYER
RL HOPPE-SEYLER'S Z. PHYSIOL. CHEM. 358:123-127(1977).
RN [2]
RP REVISIONS TO 21; 147; 215 AND 217.
RA KILTZ H.-H.;
RL SUBMITTED (OCT-1977) TO THE PIR DATA BANK.
RN [3]
RP X-RAY CRYSTALLOGRAPHY (2.7 ANGSTROMS).
RM 32170431
RA GRAU U.M., TROMMER W.E., ROSSMANN M.G.;
RL J. MOL. BIOL. 151:289-307(1981).
CC -!- CATALYTIC ACTIVITY: L-LACTATE + NAD(+) = PYRUVATE + NADH.
CC -!- SUBUNIT: HOMOTETRAMER.
CC -!- PATHWAY: FINAL STEP IN ANAEROBIC GLYCOLYSIS.
CC -!- THERE ARE THREE TYPES OF LDH CHAINS: M (LDH-A) FOUND PREDOMINANTLY
CC IN MUSCLE TISSUES, H (LDH-B) FOUND IN HEART MUSCLE AND X (LDH-C)
CC WHICH IS PRESENT ONLY IN THE SPERMATOZOA OF MAMMALS AND BIRDS.
DR PIR; A00345; DEPGLH.
DR PDB; 5LDH; 16-APR-88.
DR PROSITE; PS00064; L_LDH.
KW OXIDOREDUCTASE; NAD; GLYCOLYSIS; MULTIGENE FAMILY; ACETYLATION;
KW 3D-STRUCTURE.
FT MOD_RES 1 1 ACETYLATION.
FT ACT_SITE 193 193 ACCEPTS A PROTON DURING CATALYSIS.
FT STRAND 23 25
FT HELIX 30 42
FT STRAND 48 50
FT HELIX 56 71
FT STRAND 77 78
FT HELIX 85 87
FT STRAND 91 94
FT TURN 97 98
FT TURN 106 107
FT HELIX 111 118
FT TURN 119 120
FT HELIX 121 127
FT STRAND 132 135
FT HELIX 140 151
FT HELIX 154 157
FT STRAND 158 159
FT TURN 162 163
FT HELIX 164 178
FT STRAND 186 186
FT TURN 189 190
FT STRAND 198 199
FT STRAND 205 205
FT TURN 214 217
FT HELIX 227 239
FT HELIX 246 265
FT TURN 266 266
FT STRAND 271 271
FT STRAND 293 296
FT TURN 297 298
FT STRAND 299 303
FT HELIX 312 330
SQ SEQUENCE 333 AA; 36476 MW; 600619 CN;

DB 4; Score 68; Match 37.5%; Predicted No. 3.42e-01;
Matches 9; Conservative 8; Mismatches 6; Indels 1; Gaps 1;



Db 290 slpcvl-nargltsvinqklkdde 312

Ml II III I : ;! I s ;;: 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25
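
The Match figure on each DB line appears to be the number of identical columns divided by the alignment length, gap columns included; for RESULT 7 that is 9 of 24 columns, or 37.5%. The following is a minimal sketch of that check, assuming the Db and Qy strings are read with the OCR spacing removed. The function name `match_percent` is illustrative and is not part of MPSRCH.

```python
from typing import Tuple


def match_percent(db: str, qy: str) -> Tuple[int, float]:
    """Count identical columns in a pairwise alignment and express them
    as a percentage of the alignment length (gap columns included)."""
    if len(db) != len(qy):
        raise ValueError("aligned strings must have equal length")
    matches = sum(
        1
        for d, q in zip(db.upper(), qy.upper())
        if d == q and d != "-"  # a gap never counts as a match
    )
    return matches, round(100.0 * matches / len(db), 1)


# RESULT 7 (LDHH_PIG), as printed above with OCR spaces removed.
pig = match_percent("slpcvl-nargltsvinqklkdde", "ALPVVLENARILKNCVDAKMTEED")
print(pig)
```

The same function applied to the other alignments in this listing gives a quick way to spot Match or Matches values that were damaged in scanning.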



RESULT 8
ID LDHH_HUMAN STANDARD; PRT; 333 AA.
AC P07195;
DT 01-APR-1988 (REL. 07, CREATED)
DT 01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE L-LACTATE DEHYDROGENASE H CHAIN (EC 1.1.1.27) (LDH-B).
GN LDHB.
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; PRIMATES.
RN [1]
RP SEQUENCE FROM N.A.
RM 89193506
RA TAKENO T., LI S.S.-L.;
RL BIOCHEM. J. 257:921-924(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC TISSUE=T-CELL;
RM 88133965
RA SAKAI I., SHARIEF F.S., PAN Y.-C.E., LI S.S.-L.;
RL BIOCHEM. J. 248:933-936(1987).
RN [3]
RP VARIANT GLU-6.
RA MAEKAWA M., SUDO K., KITAJIMA M., MATSUURA Y., LI S.S.-L., KANNO T.;
RL HUM. GENET. 91:423-426(1993).
RN [4]
RP VARIANTS GLU-34; VAL-170 AND LEU-174.
RM 93216^83
RA MAEKAWA M., SUDO K., KITAJIMA M., MATSUURA Y., LI S.S.-L., KANNO T.;
RL HUM. GENET. 91:163-168(1993).
RN [5]
RP VARIANTS ARG-128 AND HIS-171.
RM 92267519
RA SUDO K., MAEKAWA M., TOMONAGA A., TSUKADA T., NAKAYAMA T.,
RA KITAMURA M., LI S.S.-L., KANNO T., TORIUMI J.;
RL HUM. GENET. 89:158-162(1992).
RN [6]
RP VARIANT HIS-171.
RA SUDO K., MAEKAWA M., IKAWA S., MACHIDA K., KITAMURA M., LI S.S.-L.;
RL BIOCHEM. BIOPHYS. RES. COMMUN. 168:672-676(1990).
CC -!- CATALYTIC ACTIVITY: L-LACTATE + NAD(+) = PYRUVATE + NADH.
CC -!- SUBUNIT: HOMOTETRAMER.
CC -!- PATHWAY: FINAL STEP IN ANAEROBIC GLYCOLYSIS.
CC -!- THERE ARE THREE TYPES OF LDH CHAINS: M (LDH-A) FOUND PREDOMINANTLY
CC IN MUSCLE TISSUES, H (LDH-B) FOUND IN HEART MUSCLE AND X (LDH-C)
CC WHICH IS PRESENT ONLY IN THE SPERMATOZOA OF MAMMALS AND BIRDS.
CC -!- DISEASE: LDHB DEFICIENCY PROBABLY HAS NO CLEAR SYMPTOMATIC
CC CONSEQUENCES.
DR EMBL; Y00711; HSLDHBR.
DR EMBL; X13794; HSLDHB1.
DR EMBL; X13795; HSLDHB3.
DR EMBL; X13796; HSLDHB4.
DR EMBL; X13797; HSLDHB5.
DR EMBL; X13798; HSLDHB6.
DR EMBL; X13799; HSLDHB7.
DR EMBL; X13800; HSLDHB8.
DR PIR; S02795; S02795.
DR MIM; 150100; 11TH EDITION.

KW OXIDOREDUCTASE; NAD; GLYCOLYSIS; MULTIGENE FAMILY; DISEASE MUTATION.

FT INIT_MET 0 0
FT VARIANT 6 6 K -> E (IN LDHB DEFICIENCY; SLIGHTLY
FT DECREASED ACTIVITY).
FT VARIANT 34 34 -> E (IN LDHB DEFICIENCY).
FT VARIANT 128 128 -> R (IN LDHB DEFICIENCY).
FT VARIANT 170 170 -> V (IN LDHB DEFICIENCY).
FT VARIANT 171 171 -> H (IN LDHB DEFICIENCY; UNSTABLE).
FT VARIANT 174 174 -> L (IN LDHB DEFICIENCY).
SQ SEQUENCE 333 AA; [illegible] MW; 598763 CN;

DB 4; Score 66; Match 33.3%; Predicted No. 7.66e-01;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db 290 slpcil-nargltsvinqklkdde 312 

Ml M III I ; ;; I s ::: 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25 



RESULT 9
ID POLG_FMDVA STANDARD; PRT; 2332 AA.
AC P03308; P03312;
DT 21-JUL-1986 (REL. 01, CREATED)
DT 01-JAN-1988 (REL. 06, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE VP4; CORE PROTEINS X, P14, P41, P19; GENOME-LINKED PROTEINS VPG1 TO
DE VPG3; PICORNAIN 3C (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED
DE RNA POLYMERASE (EC 2.7.7.48)).
OS FOOT-AND-MOUTH DISEASE VIRUS (STRAIN A12) (APHTHOVIRUS A).
OC VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN [1]
RP SEQUENCE FROM N.A.
RM [illegible]
RA ROBERTSON B.H., GRUBMAN M.J., WEDDELL G.N., MOORE D.M., WELSH J.D.,
RA FISCHER T., DOWBENKO D.J., YANSURA D.G., SMALL B., KLEID D.G.;
RL J. VIROL. 54:651-660(1985).
RN [2]
RP SEQUENCE OF 1863-2332 FROM N.A.
RM 83225613
RA ROBERTSON B.H., MORGAN D.O., MOORE D.M., GRUBMAN M.J., CARD J.,
RA FISCHER T., WEDDELL G.N., DOWBENKO D.J., YANSURA D.G.;
RL VIROLOGY 126:614-623(1983).
RN [3]
RP SEQUENCE OF 715-955 FROM N.A.
RM 82061853
RA KLEID D.G., YANSURA D.G., SMALL B., DOWBENKO D.J., MOORE D.M.,
RA GRUBMAN M.J., MCKERCHER P.D., MORGAN D.O., ROBERTSON B.H.,
RA BACHRACH H.L.;
RL SCIENCE 214:1125-1129(1981).
CC -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC VP3, AND VP4.
DR EMBL; M10975; APHA12CD.
DR PIR; A25794; GNNY4F.
KW POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW HYDROLASE; THIOL PROTEASE; MYRISTYLATION.
FT CHAIN 1 200 NONSTRUCTURAL PROTEIN P20A.
FT CHAIN 201 285 COAT PROTEIN VP4.
FT CHAIN 286 503 COAT PROTEIN VP2.
FT CHAIN 504 723 COAT PROTEIN VP3.
FT CHAIN 724 937 COAT PROTEIN VP1.
FT CHAIN 938 953 CORE PROTEIN X.
FT CHAIN 954 1107 CORE PROTEIN P14.
FT CHAIN 1108 1425 CORE PROTEIN P41.
FT CHAIN 1426 1578 CORE PROTEIN P19.
FT CHAIN 1579 [remainder illegible]
FT CHAIN 1602 1625 GENOME-LINKED PROTEIN VPG2.
FT CHAIN 1626 1649 GENOME-LINKED PROTEIN VPG3.
FT CHAIN 1650 1862 PROTEASE.
FT CHAIN 1863 2332 RNA-DIRECTED RNA POLYMERASE.
FT LIPID 201 201 MYRISTATE.
SQ SEQUENCE 2332 AA; 259408 MW; 19347576 CN;



DB 5; Score 66; Match 45.5%; Predicted No. 7.66e-01;

Matches 10; Conservative 6; Mismatches 6; Indels 0; Gaps 0;

Db 1912 vvldevifskhkgdtkmsaedk 1933

ill:: s I : IMP Ml 
Qy 5 VVLENARILKNCVDAKMTEEDK 26



RESULT 10 

ID LDHH_RABIT STANDARD; PRT; 217 AA.
AC P13490;
DT 01-JAN-1990 (REL. 13, CREATED)
DT 01-JAN-1990 (REL. 13, LAST SEQUENCE UPDATE)
DT 01-JAN-1990 (REL. 13, LAST ANNOTATION UPDATE)
DE L-LACTATE DEHYDROGENASE H CHAIN (EC 1.1.1.27) (LDH-B) (FRAGMENT).
OS ORYCTOLAGUS CUNICULUS (RABBIT).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; LAGOMORPHA.
RN [1]
RP SEQUENCE FROM N.A.
RM 89139477
RA SASS C., BRIAND M., BENSLIMANE S., RENAUD M., BRIAND Y.;
RL J. BIOL. CHEM. 264:4076-4081(1989).
CC -!- CATALYTIC ACTIVITY: L-LACTATE + NAD(+) = PYRUVATE + NADH.
CC -!- SUBUNIT: HOMOTETRAMER.
CC -!- PATHWAY: FINAL STEP IN ANAEROBIC GLYCOLYSIS.
CC -!- THERE ARE THREE TYPES OF LDH CHAINS: M (LDH-A) FOUND PREDOMINANTLY
CC IN MUSCLE TISSUES, H (LDH-B) FOUND IN HEART MUSCLE AND X (LDH-C)
CC WHICH IS PRESENT ONLY IN THE SPERMATOZOA OF MAMMALS AND BIRDS.
DR EMBL; M22584; OCLDHH.
DR PIR; B32957; B32957.
DR PROSITE; PS00064; L_LDH.
KW OXIDOREDUCTASE; NAD; GLYCOLYSIS; MULTIGENE FAMILY.
FT ACT_SITE 77 77 ACCEPTS A PROTON DURING CATALYSIS.
SQ SEQUENCE 217 AA; 24134 MW; 249993 CN;

DB 4; Score 66; Match 33.3%; Predicted No. 7.66e-01;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db 174 slpcil-nargltsvinqklkdde 196 

?|| : I III I : :; l ; :5; 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25 



RESULT 11 

ID LDHH_MOUSE STANDARD; PRT; 333 AA.
AC P16125;
DT 01-APR-1990 (REL. 14, CREATED)
DT 01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)
DT 01-AUG-1990 (REL. 15, LAST ANNOTATION UPDATE)
DE L-LACTATE DEHYDROGENASE H CHAIN (EC 1.1.1.27) (LDH-B).
GN LDH-2.
OS MUS MUSCULUS (MOUSE).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; RODENTIA.
RN [1]
RP SEQUENCE FROM N.A.
RA HIRAOKA B.Y., SHARIEF F.S., YANG Y.W., LI W.H., LI S.S.-L.;
RL EUR. J. BIOCHEM. 189:215-220(1990).
CC -!- CATALYTIC ACTIVITY: L-LACTATE + NAD(+) = PYRUVATE + NADH.
CC -!- SUBUNIT: HOMOTETRAMER.
CC -!- PATHWAY: FINAL STEP IN ANAEROBIC GLYCOLYSIS.
CC -!- THERE ARE THREE TYPES OF LDH CHAINS: M (LDH-A) FOUND PREDOMINANTLY
CC IN MUSCLE TISSUES, H (LDH-B) FOUND IN HEART MUSCLE AND X (LDH-C)
CC WHICH IS PRESENT ONLY IN THE SPERMATOZOA OF MAMMALS AND BIRDS.
DR EMBL; X51905; MMLDH2.
DR PIR; S09954; S09954.
DR PROSITE; PS00064; L_LDH.
KW OXIDOREDUCTASE; NAD; GLYCOLYSIS; MULTIGENE FAMILY.
FT INIT_MET 0 0
FT ACT_SITE 193 193 ACCEPTS A PROTON DURING CATALYSIS.
SQ SEQUENCE 333 AA; 36441 MW; 595466 CN;

DB 4; Score 66; Match 33.3%; Predicted No. 7.66e-01;
Matches 8; Conservative 9; Mismatches 6; Indels 1; Gaps 1;

Db 290 slpcil-nargltsvinqklkdde 312

Ml ; l HI I 1 ;: l ; ;:: 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25 



RESULT 12 

ID MYSD_CHICK STANDARD; PRT; 1829 AA.
AC Q02440;
DT 01-JUN-1994 (REL. 29, CREATED)
DT 01-JUN-1994 (REL. 29, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE DILUTE MYOSIN HEAVY CHAIN, ISOFORM I.
OS GALLUS GALLUS (CHICKEN).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; AVES; NEOGNATHAE;
OC GALLIFORMES.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RM 93012002
RA SANDERS G., LICHTE B., MEYER H.E., KILIMANN M.W.;
RL FEBS LETT. 311:295-298(1992).
CC -!- SIMILARITY: BELONGS TO CLASS-5 MYOSINS.
DR EMBL; X67251; GGDILUTE.
KW MYOSIN; REPEAT; ATP-BINDING; CALMODULIN-BINDING;
KW HEPTAD REPEAT PATTERN.
FT NP_BIND 163 170 ATP (BY SIMILARITY).
SQ SEQUENCE 1829 AA; 212381 MW; 15626072 CN;

DB 4; Score 66; Match 39.1%; Predicted No. 7.66e-01;
Matches 9; Conservative 6; Mismatches 8; Indels 0; Gaps 0;

Db 36 kvlqlrleegkdleycldplttke 58 

[ I IP" I PPI I 
Qy 1 KALPVVLENARILKNCVDAKMTE 23



RESULT 13 

ID ACRR_ECOLI STANDARD; PRT; 215 AA.
AC P34000;
DT 01-FEB-1994 (REL. 28, CREATED)
DT 01-FEB-1994 (REL. 28, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE POTENTIAL REPRESSOR FOR ACRAB OPERON.
GN ACRR.
OS ESCHERICHIA COLI.
OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=K12 / W4573;
RA MA D., COOK D.N., ALBERTI M., PON N.G., NIKAIDO H., HEARST J.E.;
RL J. BACTERIOL. 175:6299-6313(1993).
RN [2]
RP IDENTIFICATION.
RA RUDD K.E.;
RL UNPUBLISHED OBSERVATIONS (DEC-1993).
CC -!- FUNCTION: POTENTIAL REGULATOR PROTEIN FOR THE ACRAB GENES.
CC -!- SIMILARITY: BELONGS TO THE ACRR/TTK FAMILY.
DR EMBL; U00734; EC734.
DR ECOGENE; EG12116; ACRR.
KW TRANSCRIPTION REGULATION; DNA-BINDING; REPRESSOR.
SQ SEQUENCE 215 AA; 24766 MW; 223841 CN;

DB 1; Score 64; Match 46.7%; Predicted No. 1.69e+00;
Matches 7; Conservative 4; Mismatches 4; Indels 0; Gaps 0;

Db 143 qtlkhcieakmlpad 157 

Il-I'-Ifl I 
Qy 11 RILKNCVDAKMTEED 25 



RESULT 14 

ID HDE_CANTR STANDARD; PRT; 906 AA.
AC P22414;
DT 01-AUG-1991 (REL. 19, CREATED)
DT 01-AUG-1991 (REL. 19, LAST SEQUENCE UPDATE)
DT 01-DEC-1992 (REL. 24, LAST ANNOTATION UPDATE)
DE HYDRATASE-DEHYDROGENASE-EPIMERASE (HDE).
OS CANDIDA TROPICALIS (YEAST).
OC EUKARYOTA; FUNGI; DEUTEROMYCOTINA (IMPERFECT FUNGI).
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=ATCC 20336 / PK233;
RM 89172062
RA NUTTLEY W.M., AITCHISON J.D., RACHUBINSKI R.A.;
RL GENE 69:171-180(1988).
RN [2]
RP SIMILARITY TO SHORT CHAIN DEHYDROGENASES OF N-TERMINAL DOMAIN.
RM 90367890
RA BAKER M.E.;
CC -!- FUNCTION: SECOND TRIFUNCTIONAL ENZYME ACTING ON THE BETA-OXIDATION
CC PATHWAY FOR FATTY ACIDS, POSSESSING HYDRATASE-DEHYDROGENASE-
CC EPIMERASE ACTIVITIES.
CC -!- PATHWAY: BETA-OXIDATION PATHWAY.
CC -!- INDUCTION: BY GROWTH ON N-ALKANES OR FATTY ACIDS.
CC -!- SUBUNIT: MONOMER.
CC -!- SUBCELLULAR LOCATION: PEROXISOMAL.
CC -!- SIMILARITY: THE N-TERMINAL PART CONTAINS TWO COPIES OF INSECT-TYPE
CC ALCOHOL DEHYDROGENASE / RIBITOL DEHYDROGENASE FAMILY DOMAIN.
DR EMBL; M22765; M22765.
DR PIR; JT0350; JT0350.
DR PROSITE; PS00061; ADH_SHORT.
DR PROSITE; PS00342; MICROBODIES_CTER.
KW FATTY ACID METABOLISM; MULTIFUNCTIONAL ENZYME; OXIDOREDUCTASE; NAD;
KW LYASE; ISOMERASE; PEROXISOME; DUPLICATION.
FT DOMAIN 5 228 SHORT-CHAIN DEHYDROGENASE LIKE.
FT DOMAIN 319 532 SHORT-CHAIN DEHYDROGENASE LIKE.
FT SITE 904 906 MICROBODY TARGETING SIGNAL (POTENTIAL).
SQ SEQUENCE 906 AA; 99409 MW; 4146036 CN;



Db 91 tvhviinnagi.lrdasnUkntekd 114 

:: |:::|| || ;; MM I 
Qy 2 ALPVVLENARILKNCVDAKMTEED 25 



RESULT 15
ID POLG_FMDVS STANDARD; PRT; 861 AA.
AC P03311;
DT 21-JUL-1986 (REL. 01, CREATED)
DT 01-NOV-1988 (REL. 09, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE GENOME POLYPROTEIN (COAT PROTEINS VP3, VP1; CORE PROTEIN P52; PROTEASE
DE (EC 3.4.22.-); RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48)) (FRAGMENTS).
OS FOOT-AND-MOUTH DISEASE VIRUS (STRAIN C1-SANTA PAU [C-S8]) (APHTHOVIRUS
OC VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN [1]
RP SEQUENCE OF 1-332 FROM N.A.
RM 84005890
RA VILLANUEVA N., DAVILA M., ORTIN J., DOMINGO E.;
RL GENE 23:185-194(1983).
RN [2]
RP SEQUENCE OF 333-861 FROM N.A.
RM 85286357
RA MARTINEZ-SALAS E., ORTIN J., DOMINGO E.;
CC -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC VP3, AND VP4.
DR EMBL; M11027; PIP61.
DR PIR; A03913; A03913.
KW POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW HYDROLASE; THIOL PROTEASE.
FT NON_TER 1 1
FT CHAIN <1 46 COAT PROTEIN VP3.
FT CHAIN 47 254 COAT PROTEIN VP1.
FT CHAIN 255 332 CORE PROTEIN P52.
FT NON_CONS 332 333
FT CHAIN <333 391 PROTEASE.
FT CHAIN 392 861 RNA-DEPENDENT RNA
SQ SEQUENCE 861 AA; 95554 MW; 3818070 CN;

DB 5; Score 63; Match 40.9%; Predicted No. 2.49e+00;
Matches 9; Conservative 7; Mismatches 6; Indels 0; Gaps 0;

Db 441 vvldevifsrhkgdtkmsaedk 462
|||:: : " IMP III
Qy 5 VVLENARILKNCVDAKMTEEDK 26

Search completed: Fri Mar 24 07:43:00 1995
Job time : 12 secs.



IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences 
Release 5.4 



Results file 1-pat.res made by on Fri 24 Mar 95 7:55:32-PST.

Query sequence being compared: US-08-300-510-1 (1-27)
Number of sequences searched: 50375
Number of scores above cutoff: 3935

Results of the initial comparison of US-08-300-510-1 (1-27) with:
Data bank : A-GeneSeq 17, all entries

(Histogram: number of sequences, on a logarithmic axis from 10 to 100000, plotted against initial score from 0 to 27 and against standard deviations from the mean. The plotted points are not recoverable from the scan.)



PARAMETERS

Similarity matrix         Unitary    K-tuple                 2
Mismatch penalty                1    Joining penalty        20
Gap penalty                  1.00    Window size            27
Gap size penalty             0.05
Cutoff score                    0
Randomization group             0

Initial scores to save         40    Alignments to save     15
Optimized scores to save        0    Display context       100

SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
              2         3                  1.60

Times:     CPU            Total Elapsed
           00:00:23.02    00:00:26.00

Number of residues:               6065180
Number of sequences searched:       50375
Number of scores above cutoff:       3935

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.



The scores below are sorted by initial score. 
Significance is calculated based on initial score. 
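
The Sig. column in the tables of this report is consistent with a plain z-score: the initial score minus the mean, divided by the standard deviation. That reading is an assumption drawn from the "standard deviations above mean" separators in the listing, not from FastDB documentation, and the printed mean (2) and standard deviation (1.60) are rounded, so the sketch below reproduces the reported values only approximately. The function name `significance` is illustrative.

```python
def significance(score: float, mean: float, stdev: float) -> float:
    """Number of standard deviations by which a score exceeds the mean."""
    if stdev <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / stdev


# Mean and standard deviation as printed under SEARCH STATISTICS (rounded).
MEAN, STDEV = 2.0, 1.60

# Initial scores that occur in the summary tables of this report.
for init_score in (27, 10, 9, 8):
    print(init_score, round(significance(init_score, MEAN, STDEV), 2))
```

The small differences from the reported 15.60, 4.99, 4.37 and 3.74 are what one would expect if the program used unrounded statistics internally.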

2 100% identical sequences to the query sequence were found:



                                                  Init.  Opt.
    Sequence Name  Description             Length Score Score    Sig. Frame

 1. R41975  Human T cell reactive feline     27    27    27  15.60   0
 2. R36542  Peptide X.                       27    27    27  15.60   0

9 100% similar sequences to the query sequence were found:

                                                  Init.  Opt.
    Sequence Name  Description             Length Score Score    Sig. Frame

 3. R12120  TRFP chain 1 with leader B.      96    27    27  15.60   0
 4. R27368  TRFP Chain #1 with CI leader     96    27    27  15.60   0
 5. R36548  Recombitope YZX.                 96    27    27  15.60   0
 6. R27367  TRFP Chain #1 with CI leader     94    27    27  15.60   0
 7. R12119  TRFP chain 1 with leader A.      94    27    27  15.60   0
 8. R36539  TRFP chain 1 (with Leader A).    92    27    27  15.60   0
 9. R41983  Human T cell reactive feline     92    27    27  15.60   0
10. R36540  TRFP chain 1 (with Leader B).    88    27    27  15.60   0
11. R41984  Human T cell reactive feline     88    27    27  15.60   0



The list of other best scores is:

                                                  Init.  Opt.
    Sequence Name  Description             Length Score Score    Sig. Frame

                **** 4 standard deviations above mean ****
12. R48678  Insecticidal protoxin.         1157    10    11   4.99   0
13. R52578  Glucanase of Hordeum vulgare    334     9     9   4.37   0
14. P93413  Carbamate hydrolase.            493     9    11   4.37   0
                **** 3 standard deviations above mean ****
15. R49554  Corynebacterium halohydrin ep   244     8    10   3.74   0
16. R28296  Halohydrin epoxidase enzyme.    244     8    10   3.74   0
17. R03623  Zucchini yellow mosaic virus    283     8    10   3.74   0
18. P92062  Sequence of Isopenicillin N s   333     8     9   3.74   0
19. R45741  Myoinositol dehydrogenase.      334     8     9   3.74   0
20. R60654  pstS variant.                   346     8    10   3.74   0
21. R60653  pstS variant.                   346     8    10   3.74   0
22. R60652  pstS variant.                   346     8    10   3.74   0
23. R60651  pstS variant.                   346     8    10   3.74   0
24. R60650  pstS variant.                   346     8    10   3.74   0
25. R60649  pstS variant.                   346     8    10   3.74   0
26. R60648  pstS variant.                   346     8    10   3.74   0
27. R60647  pstS variant.                   346     8    10   3.74   0
28. R51473  pstS gene product of E.coli.    346     8    10   3.74   0
29. R60646  pstS variant.                   346     8    10   3.74   0
30. R60645  pstS variant.                   346     8    10   3.74   0
31. R60644  pstS variant.                   346     8    10   3.74   0
32. R60643  pstS variant.                   346     8    10   3.74   0
33. R60642  pstS variant.                   346     8    10   3.74   0
34. R60641  pstS variant.                   346     8    10   3.74   0
35. R60640  pstS variant.                   346     8    10   3.74   0
36. P82053  Outer membrane protein F of P   350     8     8   3.74   0
37. R42064  Endoglucanase enzyme.           376     8     9   3.74   0
38. R37151  Dye transfer inhibiting comps   376     8     9   3.74   0
39. R27969  Endoglucanase enzyme.           376     8     9   3.74   0
40. R25429  Cellulase contained in a dete   376     8     9   3.74   0



1. US-08-300-510-1 (1-27)
   R41975 Human T cell reactive feline protein fragment X.

ID R41975 standard; peptide; 27 AA.
AC R41975;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein fragment X.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen; ss.
OS Homo sapiens.
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006116.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL,
PI Kuo M, Morville M;
DR WPI; 93-320744/40.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Claim 1; Fig 3; 107pp; English.
CC The sequences given in R41975-82 are peptides derived from a human T
CC cell reactive feline protein. These peptides are used in a
CC therapeutic composition which is useful in treating diseases which
CC involve an immune response to a protein antigen. This composition
CC may be used to induce tolerance in a mammal to Dermatophagoides,
CC Felis, Ambrosia, Lolium, Cryptomeria, Alternaria, Alder, Betula,
CC Quercus, Olea, Artemesia, Plantago, Parietaria, Canis, Blattella,
CC Apis, Periplaneta and to autoantigens in humans.
SQ Sequence 27 AA;
SQ 2 A; 1 R; 0 N; 3 D; 0 B; 0 C; 2 Q; 2 E; 0 Z; 1 G; 0 H;
SQ 0 I; 3 L; 2 K; 0 M; 1 F; 2 P; 0 S; 2 T; 0 W; 2 Y; 4 V;

Initial Score    =   27  Optimized Score   =  27  Significance =  15.60
Residue Identity = 100%  Matches           =  27  Mismatches   =      0
Gaps             =    0  Conservative Substitutions        =      0

X        10        20     X
KRDVDLFLTGTPDEYVEQVAQYKALPV
|||||||||||||||||||||||||||
KRDVDLFLTGTPDEYVEQVAQYKALPV
X        10        20     X
X 10 20 X 



2. US-08-300-510-1 (1-27)
   R36542 Peptide X.

ID R36542 standard; Protein; 27 AA.
AC R36542;
DT 12-AUG-1993 (first entry)
DE Peptide X.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope.
OS Felis.
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 4; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP. DNA
CC constructs were assembled in which 3 regions (encoding peptides X,
CC Y and Z) were linked to produce DNA constructs encoding recombitope-
CC peptides.
SQ Sequence 27 AA;
SQ 2 A; 1 R; 0 N; 3 D; 0 B; 0 C; 2 Q; 2 E; 0 Z; 1 G; 0 H;
SQ 0 I; 3 L; 2 K; 0 M; 1 F; 2 P; 0 S; 2 T; 0 W; 2 Y; 4 V;

Initial Score    =   27  Optimized Score   =  27  Significance =  15.60
Residue Identity = 100%  Matches           =  27  Mismatches   =      0
Gaps             =    0  Conservative Substitutions        =      0

X        10        20     X
KRDVDLFLTGTPDEYVEQVAQYKALPV
|||||||||||||||||||||||||||
KRDVDLFLTGTPDEYVEQVAQYKALPV
X        10        20     X



3. US-08-300-510-1 (1-27)
   R12120 TRFP chain 1 with leader B.

ID R12120 standard; Protein; 96 AA.
DT 26-JUL-1991 (first entry)
DE TRFP chain 1 with leader B.
KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 9..26
FT /label= Leader B
FT Protein 27..96
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U06548.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM COR.
PI Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL;
PI Brauer AW;
DR WPI; 91-164136/22.
DR N-PSDB; Q11837.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.
CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genomic library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptide derivs.
CC which stimulate T-cells in persons allergic to cats. The peptides
CC can be used to reduce/eliminate the allergic response partic. by
CC modificn. of lymphokine prodn. by the T-cells. They can also be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species, and also
CC for prodn. of modified forms of TRFP esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12119-R12123.
SQ Sequence 96 AA;
SQ 12 A; 4 R; 3 N; 8 D; 0 B; 6 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 7 K; 2 M; 1 F; 7 P; 3 S; 6 T; 2 W; 3 Y; 8 V;

Initial Score    =   27  Optimized Score   =  27  Significance =  15.60
Residue Identity = 100%  Matches           =  27  Mismatches   =      0
Gaps             =    0  Conservative Substitutions        =      0

                                X        10        20     X
                                KRDVDLFLTGTPDEYVEQVAQYKALPV
                                |||||||||||||||||||||||||||
AWRCSWKRMLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVD
        10        20        30  X     40        50        60        70

AKMTEEDKENALSLLDKIYTSPLC
      80        90



4. US-08-300-510-1 (1-27)
   R27368 TRFP Chain #1 with CI leader B sequence.

ID R27368 standard; protein; 96 AA.
AC R27368;
DT 25-FEB-1993 (first entry)
DE TRFP Chain #1 with CI leader B sequence.
KW T cell reactive feline protein; cat allergy; allergic; IgE;
KW desensitizing.
OS Felis domesticus.
FH Key Location/Qualifiers
FT Peptide 1..27
FT Protein 26..96
FT /label= TRFP chain #1
PN WO9215613-A.
PD 17-SEP-1992.
PF 20-FEB-1992; U01344.
PR 28-FEB-1991; US-662193.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond J, Kuo M;
PT Modified human T-cell reactive feline protein - stimulates T-cell
PT in individuals allergic to cats and shows reduced
PT histamine-releasing properties
CC This sequence represents a modified human T-cell reactive feline
CC protein which stimulates T-cells from an individual who is allergic
CC to cats, but which interacts with human IgE to a lesser extent than
CC does affinity purified TRFP. The protein is modified by treating
CC with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC amines) or an enzyme which removes O-linked groups (carbohydrate
CC moieties). It is useful in desensitising people who are allergic to cats

ti r 2 q rr R; 9 3 A s; e . E? . c: 2 7 * 0 * ;! ; «, 

SQ 3 I; 11 L; 7 K; 2 M; 1 F; 7 P; 3 S; 6 T; 2 W; 3 Y; 7 V;

Initial Score    =   27  Optimized Score   =  27  Significance =  15.60
Residue Identity = 100%  Matches           =  27  Mismatches   =      0
Gaps             =    0  Conservative Substitutions        =      0

                                X        10        20     X
                                KRDVDLFLTGTPDEYVEQVAQYKALPV
                                |||||||||||||||||||||||||||
AWRCSWKRMLDAALPPCPTBAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVD
        10        20        30  X     40        50        60        70

AKMTEEDKENALSLLDKIYTSPLC
      80        90



5. US-08-300-510-1 (1-27)
   R36548 Recombitope YZX.

ID R36548 standard; Protein; 96 AA.
AC R36548;
DT 12-AUG-1993 (first entry)
DE Recombitope YZX.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope;
KW sensitivity; Felis domesticus.
OS Synthetic.
FH Key Location/Qualifiers
FT Cleavage_site 14..15
FT /label= thrombin_cleavage_site
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41572.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 8; 73pp; English.
CC Preferred recombitope peptides for treating sensitivity to Felis
CC domesticus are derived from the genus Felis and comprise
CC regions selected from peptides X, Y, Z, A and B, of TRFP, and
CC modifications thereof, such as peptide C.
CC Oligonucleotides C, D, E, F, G, H and I are used in the
CC construction of recombitope peptide YZX.
SQ Sequence 96 AA;
SQ 8 A; 4 R; 5 N; 6 D; 0 B; 1 C; 2 Q; 10 E; 0 Z; 4 G; 6 H;
SQ 1 I; 12 L; 7 K; 2 M; 4 F; 5 P; 2 S; 5 T; 0 W; 2 Y; 10 V;

Initial Score    =   27  Optimized Score   =  27  Significance =  15.60
Residue Identity = 100%  Matches           =  27  Mismatches   =      0
Gaps             =    0  Conservative Substitutions        =      0

                                                                     X
                                                                     KRD
                                                                     |||
MGHHHHHHEFLVPRGSKALPVVLENARILKNCVDAKMTEEDKEFFAVANGNELLLDLSLTKVNATEPERKRD
        10        20        30        40        50        60        70

     10        20      X
VDLFLTGTPDEYVEQVAQYKALPV
||||||||||||||||||||||||
VDLFLTGTPDEYVEQVAQYKALPV
      80        90     X



US-08-300-510-1 (1-27) 

R27367 TRFP Chain #1 with CI leader A sequence, 



ID 

AC 

DT 

DE 

KW 

OS 

FH 

FT 

FT 

FT 

FT 

PN 

PD 

PF 

PR 

PA 

PI 

DR 

PT 

PT 

PT 

PS 

CC 

CC 

CC 

CC 

CC 

CC 

CC 

SQ 

SQ 

SQ 



R27367 standard; protein; 94 AA. 
R27367J 

25-FEB-1993 (first entry) 

TRFP Chain 81 with CI leader A sequence, 

T cell reactive feline protein. 

Felis domesticus. 

Key Location/Qualifiers 
Peptide 1..25 
/label= Leader A 
Protein 25.. 94 

/label= TRFP chain #1 
W09215613-A. 
17-SEP-1992. 
20-FEB-1992J 
28-FEB-1991; 
(IHtlU-) 
Bond J> 



U01344. 
US-662193. 
IMKULOGIC PHARH CORP. 
Kuo m; 



-cell 



WPI; 92-331670/40. 

Modified human T-cell reactive feline protein - stimulates T- 
in individuals allergic to cats and shows reduced 
histamine-releasing properties 
Claim l; Fig If 35pp! English. 

This sequence represents a modified human T-cell reactive feline 
protein which stimulates T-cells from an individual who is allergic 
to cats, but which interacts with human IgE to a lesser extent than 
does affinity purified TRFP. The protein is modified by treating 
with either a mild alkali <pH 12.5-13.5 . K0H, NaOHr LiOH or tertiary 
amines) or an enzyme which removes 0-1 inked groups (carbohydrate 
moieties). It is useful in desensitising people who are allergic to cats 

Sequence 94 AA; ..„„„, 
9 A; 3 r; 4 n; 6 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 4 G; o H, 
5 I; 15 l; 7 K; 2 Hi l F; 4 P; 2 S; 4 T; 2 w; 3 Y ; v; 



Initial Score 
Residue Identity 
Gaps 



27 Optimized Score = 27 
100% Matches = 27 

0 Conservative Substitutions 



Significance 
Mismatches 



15.60 
0 
0 



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 



CIMKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAK 

20 30 40 50 * 6U 



10 



MTEEDKENALSLLDKIYTSPLC 
80 90 
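
The SQ lines of each record quote a per-residue composition that should add up to the stated sequence length, which gives a simple way to cross-check counts that are hard to read. The following is a minimal sketch of that check, not part of the search software; the function names and the `R27367_SQ` dictionary are illustrative, and the dictionary deliberately leaves out the V count so that it can be inferred from the 94 AA length.

```python
from collections import Counter

def composition(sequence: str) -> Counter:
    """Count each one-letter residue code in a protein sequence."""
    return Counter(sequence.upper())

def matches_sq_line(sequence: str, sq_counts: dict) -> bool:
    """True when the counts quoted on SQ lines agree with the sequence."""
    counts = composition(sequence)
    return (sum(sq_counts.values()) == len(sequence)
            and all(counts.get(res, 0) == n for res, n in sq_counts.items()))

# Query peptide used throughout this search (27 residues).
QUERY = "KRDVDLFLTGTPDEYVEQVAQYKALPV"
query_counts = composition(QUERY)

# Counts read from the R27367 SQ lines, with V omitted on purpose.
R27367_SQ = {"A": 9, "R": 3, "N": 4, "D": 6, "B": 0, "C": 5, "Q": 2,
             "E": 7, "Z": 0, "G": 4, "H": 0, "I": 5, "L": 15, "K": 7,
             "M": 2, "F": 1, "P": 4, "S": 2, "T": 4, "W": 2, "Y": 3}

# The record states 94 AA, so the remaining residues must all be V.
inferred_v = 94 - sum(R27367_SQ.values())
```

The same subtraction can be applied to any record in which a single count is illegible, provided the stated length and the other counts are trusted.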



7. US-08-300-510-1 (1-27) 

R12119 TRFP chain 1 with leader A. 



ID R12119 standard; Protein; 94 AA. 

AC R12119; 

DT 26-JUL-1991 (first entry) 

DE TRFP chain 1 with leader A. 

KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 3..24
FT /label= Leader B
FT Protein 25..94
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U0654B.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Gefter ML, Garman RD, Greenstein JL, Kuo M, Rogers
PI Brauer AW;
DR WPI; 91-164136/22.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.

CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genome library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptides
CC which stimulate T-cells in persons allergic to cats. The peptides
CC can be used to reduce/eliminate the allergic response partic. by
CC modificn. of lymphokine prodn. by the T-cells. They may also be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species,
CC for prodn. of modified forms of TRFP esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12120-R12123.

SQ Sequence 94 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 5 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.60
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 X

KRDVDLFLTGTPDEYVEQVAQYKALPV

|||||||||||||||||||||||||||
CIMKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAK

10 20 30 40 50 X 60 70

MTEEDKENALSLLDKIYTSPLC



8. US-08-300-510-1 (1-27) 

R36539 TRFP chain 1 (with Leader A). 



ID R36539 standard; Protein; 92 AA.
AC R36539;
DT 12-AUG-1993 (first entry)
DE TRFP chain 1 (with Leader A).
KW Human T cell reactive feline protein; TRFP; leader A; leader B;
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..22
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville
PI Rogers BL;
DR WPI; 93-152473/18.
PT R^lto^peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 1; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 92 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 4 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 4 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.60
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGT^ 

10 20 30 40 50 X 60 /" 

EEDKENALSLLDKI YTSPLC 
80 90 
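
Each hit carries the same three-line score block (Initial Score, Optimized Score, Significance, Residue Identity, Matches, Mismatches, Gaps, Conservative Substitutions). For anyone tabulating these results, a small parser along the following lines may help. It is only a sketch that assumes cleanly transcribed `Name = value` pairs; the function name and regular expression are illustrative and are not part of FastDB.

```python
import re

# One "Name = value" pair; the value may be an integer, a decimal, or a percentage.
FIELD = re.compile(r"([A-Z][A-Za-z ]*?)\s*=\s*(\d+(?:\.\d+)?)%?")

def parse_score_block(text: str) -> dict:
    """Turn a FastDB-style score block into a dict of numeric fields."""
    result = {}
    for name, value in FIELD.findall(text):
        result[name.strip()] = float(value) if "." in value else int(value)
    return result

block = (
    "Initial Score = 27 Optimized Score = 27 Significance = 15.60\n"
    "Residue Identity = 100% Matches = 27 Mismatches = 0\n"
    "Gaps = 0 Conservative Substitutions = 0"
)
scores = parse_score_block(block)
```

Field names must start with a capital letter for this pattern to pick them up whole, so OCR variants such as a lower-case "initial Score" would need correcting first.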



9. US-08-300-510-1 (1-27)
R41983 Human T cell reactive feline protein A chain 1.

ID R41983 standard; Protein; 92 AA.
AC R41983;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein A chain 1.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS Homo sapiens.
FH Key Location/Qualifiers
FT Peptide 1..22
FT /note= "Signal peptide"
FT Protein 23..92
FT /note= "Mature protein"
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006U6.
PA (IMMU-) IMMUNOLOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M,
PI Morville M;
DR WPI; 93-320744/40.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Disclosure; Fig 1; 107pp; English.
CC The sequences given in R41983-84 represent chain 1 of human T cell
CC reactive feline proteins (TRFP) A and B respectively. Peptides
CC derived from TRFP may be used in a therapeutic composition which is
CC useful in treating diseases which involve an immune response to a
CC protein antigen. This composition may be used to induce tolerance
CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC in humans.

SQ Sequence 92 AA ; Q; 0 H; 

n i !:s v,\ v,\ v,\ i;i ?;J „, „ 

Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 



MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMT 



EEDKENALSLLDKIYTSPLC 
80 90 



10. US-08-300-510-1 (1-27) 

R36540 TRFP chain 1 (with Leader B).

ID R36540 standard; Protein; 88 AA.
AC R36540;
DT 12-AUG-1993 (first entry)
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..18
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 1; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 88 AA;
SQ 11 A; 2 R; 3 N; 8 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 6 K; 2 M; 1 F; 7 P; 2 S; 6 T; 0 W; 3 Y; 8 V;

Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 



MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKAL 

10 20 X 30 40 50 60 70 

ENALSLLDKIYTSPLC 
80 



11. US-08-300-510-1 (1-27)

R41984 Human T cell reactive feline protein B chain 1.

ID R41984 standard; Protein; 88 AA.
AC R41984;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein B chain 1.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS Homo sapiens.
FH Key Location/Qualifiers
FT Peptide 1..17
FT /note= "Signal peptide"
FT Protein 18..88
FT /note= "Mature protein"
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006U6.
PA (IMMU-) IMMUNOLOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M,
PI Morville M;
DR WPI; 93-320744/40.
DR N-PSDB; Q49534.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Disclosure; Fig 1; 107pp; English.
CC The sequences given in R41983-84 represent chain 1 of human T cell
CC reactive feline proteins (TRFP) A and B respectively. Peptides
CC derived from TRFP may be used in a therapeutic composition which is
CC useful in treating diseases which involve an immune response to a
CC protein antigen. This composition may be used to induce tolerance
CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC in humans.
SQ Sequence 88 AA;
SQ 11 A; 2 R; 3 N; 8 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 6 K; 2 M; 1 F; 7 P; 2 S; 6 T; 0 W; 3 Y; 8 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.60
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQ 

10 20 X 30 40 50 60 70 

ENALSLLDKIYTSPLC 
80 



12. US-08-300-510-1 (1-27) 

R48678 Insecticidal protoxin. 

ID R48678 standard; Protein; 1157 AA. 

AC R48678; 

DT 13-OCT-1994 (first entry)
DE Insecticidal protoxin.
KW Insecticide; toxin; protoxin; Lepidoptera; pest control;
KW Bacillus thuringiensis; crop protection; crystal protein;
KW delta endotoxin.
OS Bacillus thuringiensis (BTS02618A).
PN WO9405771-A.
PD 17-MAR-1994.
PF 12-JUL-1993; E01820.
PR 27-AUG-1992; EP-402358.
PR 09-APR-1993; EP-400949.
PA (PLBZ ) PLANT GENETIC SYSTEMS NV.
PI Jansens S, Lambert B, Peferoen M, Van Audenhove K;
DR WPI; 94-101176/12.
DR N-PSDB; Q56782.
PT Bacillus thuringiensis strains producing insecticidal proteins
PT active against Lepidoptera species - to produce transgenic
PT plants resistant to Lepidoptera species
PS Claim 3; Page 39-44; 49pp; English.
CC The DNA encoding the protoxin can be used to transform a plant to
CC protect the plant from Lepidopteran pests. The protoxin produced
CC yields a toxin product after trypsin digestion. The protoxin, toxin,
CC crystal proteins and the Bacillus strain producing them can all be
CC used as the active ingredient in an insecticidal composition.

II fuf" r,"? 7 " 75 d; 0 b; 12 c; 59 a, 66 e; o z; 80 g; 22 H; 

It 50 U lotll 24 K; 13 W J 48 F J 46 P. 83 81 89 T; 15 K. 54 Y; 88 V; 

Initial Score = 10 Optimized Score = 11 Significance = 4.99
Residue Identity = 46% Matches = 13 Mismatches = 13
Gaps = 2 Conservative Substitutions = 0

NPGVDGTNRIESTAVDFRSALIGIYGVNRASFVPGGLFNGTTSPANGGCRDLYDTNDELPPDESTGSSTHRL 
410 420 430 440 450 460 470 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKA-LPV 

SHVTFFSFBTNSAGSIANACSVPTYVWTRRDVDLN^ 

480 490 500 510 520 530 540 




TTTGPFNPPFTF 
620 



13. US-08-300-510-1 (1-27) 

R52578 Glucanase of Hordeum vulgare L.

ID R52578 standard; Protein; 334 AA.
AC R52578;
DT 05-DEC-1994 (first entry)
DE Glucanase of Hordeum vulgare L.
KW Antifungal; pathogen; resistance; transgenic organism; synergy;
KW crop protection; transgenic plant; chitinase; glucanase;
KW protein synthesis inhibitor; disease.
OS Hordeum vulgare L.
PN DE4234131-A.
PD 21-APR-1994.
PF 09-OCT-1992; 234131.
PR 09-OCT-1992; DE-234131.
PA (PLAC ) MAX PLANCK GES FOERDERUNG WISSENSCHAFTEN.
PI Chet I, Eckes P, Gornhardt B, Jach G, Logemann J;
PI Mundy J, Schell J, Goernhardt B;
DR WPI; 94-136599/17.
PT Transgenic organisms contg. at least 2 pathogen inhibiting genes
PT - esp. plants contg. genes with antifungal activity, show
PT synergistic increase in disease resistance, also new DNA transfer
PT vectors
PS Example 2; Page 15-16; 19pp; German.
CC Glucanase is an enzyme which breaks down glucan, a glucose polymer
CC present in fungal cell walls. The sequence encoding the glucanase
CC enzyme may be used in the construction of transgenic organisms,
CC especially plants, to produce pathogen resistant organisms. The
CC genome of such transgenic organisms preferably contains more than
CC one gene with pathogen inhibiting activity, each under the
CC control of active promoters. The two gene products then show a
CC synergistic increase in pathogen induced activity so that the
CC transgenic organisms have a greater degree of resistance or
CC resistance against a wider spectrum of diseases.
SQ Sequence 334 AA;
SQ 50 A; 14 R; 28 N; 13 D; 0 B; 1 C; 12 Q; 8 E; 0 Z; 31 G; 1 H;
SQ 19 I; 22 L; 10 K; 6 M; 16 F; 16 P; 27 S; 16 T; 2 W; 16 Y; 26 V;

Initial Score = 9 Optimized Score = 9 Significance = 4.37
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

NEVQGGATSSILPAMRNLNAALSAAGLGAIKVSTSIRFDEVANSFPPSAGVFNNAYITDVARLLASTGAPLL 
130 140 150 160 170 180 190 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

ANVYPYFAYRDNPGSISLNYATFQPGTTVRDQNNGLTYTSLFDAMVDAVYAALEKAGAPAVKVVVSESGHPS 

200 210 220 230 240 250 260 

AGGFAASAGNARTYNQGLINHVGGGTPKKREALETYIFAMFNENSKTGDATERSFGLFNPDKSPAYMIQF 
270 280 290 300 310 320 330 



14. US-08-300-510-1 (1-27) 

P93413 Carbamate hydrolase. 



ID P93413 standard; protein; 493 AA.
AC P93413;
DT 27-APR-1990 (first entry)
DE Carbamate hydrolase.
KW Carbamate hydrolase; Arthrobacter oxidans; phenmedipham;
KW methyl 3-hydroxyphenylcarbamate.
OS Arthrobacter oxidans DSM 4044.
PN EP-343100-A.
PD 23-NOV-1989.
PF 17-MAY-1989; 730123.
PR 19-MAY-1988; DE-381738.
PA (SCHD) Schering AG.
PI Pohlenz HD, Boidol W;
DR WPI; 89-341858/47.
DR N-PSDB; N92585.
PT Pure carbamate hydrolase isolation from Arthrobacter oxidans - able to
PT destroy herbicide phenmedipham, and DNA encoding it, for imparting
PT resistance to plants.
PS Disclosure; Fig. 7; 17pp; German.
CC Purified carbamate hydrolase can be used to isolate/identify A.oxidans
CC carbamate hydrolase gene system. This system makes plants resistant to
CC the herbicide phenmedipham. Carbamate hydrolase has pH optimum
CC wt. 50-60 kD, isoelectric point 6.2, and can cleave phenmedipham to methyl
CC 3-hydroxyphenylcarbamate, m-toluidine and CO2, so inactivating
CC it. It is produced by A.oxidans DSM 4044 which contains the 41 kb plasmid
CC pHP52.

SQ Sequence 493 AA; ^ „. „ n , , fl! 9t P: ft z . 5 0 G; 19 H; 

SQ 
SQ 



m a; 33 p; il n; 41 D; o B; 3 C; 15 a; 24 E; 0 z; 50 G; 
il U 40 [I 4 K; 4 Hi 21 Fl 39 Pi 20 81 30 Tl 14 Ul 10 Yl 40 V; 



Initial Score = 9 Optimized Score = 11 Significance = 4.37 

Residue Identity = 32% Matches = 13 Mismatches = 14

Gaps = 13 Conservative Substitutions = 0

VPYAEPPVGDLRWRAARPHAGWTGVRDASAYGPSAPQPVEPGGSPILGTHGDPPFDEDCLTLNLWTPNLDGG 
30 40 50 60 70 80 90 

x 10 20 X 

KRDVDL FLTGTPDEYVEQVAQYKALPV 

|| || II I M I I II 

SRPVLVWIHGGGLLTGSGNLPNYATDTFARDGDLVGISINYRLGPLGFLAGMGDENVWLTDQVEALRWI ADN 

100 110 120 130 140 150 160 170 

VAAFGGDPMRITLVGQSGGAYSIAALAQHPVARQLFHRAILSSPPFGMQPHTVEESTARTKALARHLGHDDI 
180 190 200 210 220 230 240 

EALRHEPWERLIQGT IGVLMEHTK 
250 260 
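
The Residue Identity percentages reported in these result blocks appear to be the number of matches divided by the total number of alignment columns (matches, mismatches and gaps), truncated to a whole percent. That rule is inferred here from the reported figures rather than taken from FastDB documentation, so the sketch below should be read as a consistency check only; the function name is illustrative.

```python
def residue_identity(matches: int, mismatches: int, gaps: int) -> int:
    """Percent identity over all alignment columns, truncated to an integer.

    Assumption: the denominator is matches + mismatches + gaps, which
    reproduces the values printed in this report.
    """
    columns = matches + mismatches + gaps
    return (100 * matches) // columns

# (matches, mismatches, gaps) -> reported Residue Identity in this report
reported = {
    (27, 0, 0): 100,   # exact TRFP chain 1 hits
    (26, 1, 0): 96,    # PIR major allergen chain 1 precursors
    (9, 18, 0): 33,    # e.g. R52578 glucanase
    (10, 16, 0): 38,   # ISUTTB triose-phosphate isomerase
    (14, 13, 7): 41,   # S15199 hydrogenase isozyme hypC
    (12, 15, 8): 34,   # rabbit Ig kappa chain C regions
    (13, 14, 13): 32,  # P93413 carbamate hydrolase
}
computed = {key: residue_identity(*key) for key in reported}
```

Under this reading, a hit with many gaps can show more matches than an ungapped hit yet a lower identity, as the carbamate hydrolase entry does.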



15. US-08-300-510-1 (1-27) 

R49554 Corynebacterium halohydrin epoxidase encoded by pi 

ID R49554 standard; Protein; 244 AA. 

AC R49554;
DT 07-JUL-1994 (first entry)
DE Corynebacterium halohydrin epoxidase encoded by plasmid pSTOlS.
KW 3-hydroxynitrile; halohydrin epoxidase gene; recombinant plasmid;
KW Escherichia coli.
OS Corynebacterium sp. (strain N-1074).
PN J05317066-A.
PD 03-DEC-1993.
PF 04-MAR-1991; 062597.
PR 04-MAR-1991; JP-062597.
PA (NITT ) NITTO CHEM IND CO LTD.



I-H v t HflH / I Hi Pi H U H ri. 

DR WPI; 94-011029/02. 
CC R49554. 

SQ Sequence 244 Aft? Q Z; 17 c , 4 H J 

S ?; r; a J! = i: " 5; u B F ; u 5; i. S; « « , 7 „ „ 

,..,*» I*.tit, - « "J^,,, SubstltuUMS . 0 

Gaps 

VF«L..FP.UL.«I»U* I WMV.F.T^ 

110 120 130 l«u 

x 10 20 X 

KRDVDLFLTGTPDE— YVEQVAQYKALPV 

A!0 P NFFNN PTV F PTSO UE «N P a RER V^l P ^LipiikaAUT f Js RR iA|.,C 8FF A F KCV L P 

180 l'O 200 X 210 

IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file 1-pir.res made by on Fri 24 Mar 95 7:46:05-PST.

Query sequence being compared: US-08-300-510-1 (1-27)
Number of sequences searched: 75511
Number of scores above cutoff: 4166

Results of the initial comparison of US-08-300-510-1 (1-27) with:
Data bank : PIR 43, all entries

(Histogram: number of sequences versus initial SCORE, with a STDEV scale beneath the score axis; the plot is not recoverable from the scan.)

PARAMETERS

Similarity matrix         Unitary    K-tuple               2
Mismatch penalty          1          Joining penalty      20
Gap penalty               1.00       Window size          27
Gap size penalty          0.05
Cutoff score              0
Randomization group       0

Initial scores to save    40         Alignments to save   15
Optimized scores to save  0          Display context     100



SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           3       4         1.38

Times:     CPU            Total Elapsed
           00:01:01.07    00:01:03.00

Number of residues:            22468834
Number of sequences searched:  75511
Number of scores above cutoff: 4166

Cut-off raised to 3. 
Cut-off raised to 4. 
Cut-off raised to 5. 
Cut-off raised to 6. 
Cut-off raised to 7. 

The scores below are sorted by initial score.
Significance is calculated based on initial score.
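
The reported significance values behave like a z-score of the initial score against the score distribution of the whole search. The sketch below assumes that reading, using the mean (3) and standard deviation (1.38) printed under SEARCH STATISTICS; because both figures are rounded in the report, the results only approximate the printed values. The function name is illustrative and is not a FastDB routine.

```python
def significance(score: float, mean: float, stdev: float) -> float:
    """Number of standard deviations by which a score exceeds the mean."""
    return (score - mean) / stdev

MEAN, STDEV = 3, 1.38  # rounded values from the SEARCH STATISTICS block

# Initial score -> significance as printed in the summary table
printed = {27: 17.44, 26: 16.71, 10: 5.09, 9: 4.36}
estimated = {s: significance(s, MEAN, STDEV) for s in printed}
```

On this reading, the group headings in the summary table ("16 standard deviations above mean", "5 ...", "4 ...") are simply the integer part of the same quantity.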



A 100% similar sequence to the query sequence was found:

                                                    Init.  Opt.
    Sequence Name  Description             Length  Score  Score   Sig.  Frame

 1. A53283   major cat allergen Fel d I al     40     27     27  17.44

The list of other best scores is:

                                                    Init.  Opt.
    Sequence Name  Description             Length  Score  Score   Sig.  Frame

             **** 16 standard deviations above mean ****
 2. JC1126   major allergen chain 1 precur     88     26     26  16.71    0
 3. JC1136   major allergen chain 1 precur     92     26     26  16.71    0
             **** 5 standard deviations above mean ****
 4. ISUTTB   triose-phosphate isomerase (E    250     10     10   5.09    0
             **** 4 standard deviations above mean ****
 5. PN0644   hypothetical protein 66 - Str     66      9      9   4.36    0
 6. LNPG1    pulmonary surfactant protein      79      9      9   4.36    0
 7. S15199   hydrogenase isozyme hypC - Es     90      9     12   4.36    0
 8. F53275   kappa 1 b95 allotype=constant    104      9     11   4.36    0
 9. K5RBV    Ig kappa chain C region (B5 v    104      9     11   4.36    0
10. A20968   Ig kappa-lb5 chain C region -    105      9     11   4.36    0
11. S43188   orotidine-5'-phosphate decarb    232      9      9   4.36    0
12. JS0618   glutathione transferase (EC 2    244      9     10   4.36    0
13. A38233   triose-phosphate isomerase (E    253      9      9   4.36    0
14. S04405   hydroxyneurosporene synthase     281      9     10   4.36    0
15. S21394   transposase - Mycobacterium t    308      9     10   4.36    0
16. S37652   FVT1 protein - human             332      9      9   4.36    0
17. D38664   glucan endo-1,3-beta-D-glucos    334      9      9   4.36    0
18. SQ5510   glucan endo-1,3-beta-D-glucos    334      9      9   4.36    0
19. A35630   regulatory protein algR3 - Ps    340      9      9   4.36    0
20. S34494   ccsA protein - Euglena gracil    348      9      9   4.36    0
21. S23088   ccsA protein - Euglena gracil    348      9      9   4.36    0
22. JQ0148   hypothetical 34.4K protein       351      9      9   4.36    0
23. A36128   regulatory protein algP - Pse    352      9      9   4.36    0
24. S13822   protein Z4 - barley              399      9      9   4.36    0
25. S12785   protein ch-42 precursor, chlo    424      9      9   4.36    0
26. A37807   3-phosphoshikimate 1-carboxyv    450      9      9   4.36    0
27. A48788   leucyl aminopeptidase (EC 3.4    469      9      9   4.36    0
28. A45737   phenylcarbamate hydrolase - A    493      9     11   4.36    0
29. PQ0470   probable leucyl aminopeptidas    554      9     10   4.36    0
30. S41376   leucine aminopeptidase - pota    573      9     10   4.36    0
31. S22967   polyphenol oxidase precursor     604      9      9   4.36    0
32. S22965   polyphenol oxidase precursor     630      9      9   4.36    0
33. S18737   gag polyprotein - simian foam    647      9     10   4.36    0
34. S32899   ferric-pseudobactin receptor     809      9      9   4.36    0
35. S10639   fruB protein - Rhodobacter ca    827      9     10   4.36    0
36. B27211   virA protein - Agrobacterium     829      9      9   4.36    0
37. S04035   virA protein - Agrobacterium     829      9      9   4.36    0
38. WHBE56   infected cell protein ICP18.5    850      9     10   4.36    0
39. S44250   integrin alpha 5 chain - mous   1053      9      9   4.36    0
40. S35548   DNA-directed RNA polymerase (   1210      9     10   4.36    0


1. US-08-300-510-1 (1-27)

A53283 major cat allergen Fel d I alpha chain - cat (frag

ENTRY A53283 #type fragment
TITLE major cat allergen Fel d I alpha chain - cat (fragment)
ORGANISM #formal_name Felis silvestris catus #common_name domestic cat
DATE 12-May-1994 #sequence_revision 12-May-1994 #text_change
12-May-1994



^Sls 0.... Crr.lr.. ^ »Uti. C.I P.... F.. U*Td.r.. 

allergen Felis domesticus I. 

#accession A53283
##status preliminary
##molecule_type protein
##residues 1-40 ##label DUF
SUMMARY #length 40 #checksum 3032
SEQUENCE

Initial Score = 27 Optimized Score = 27 Significance = 17.44
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

10 20 



KRDVDLFLTGTPDEYVEQVAQYKALPV 

|||||||||||||||||||||||||||
EICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARI 

X 10 20 30 X 40 



ENTRY JC1126 #type complete
TITLE major allergen chain 1 precursor
ORGANISM #common_name domestic cat
DATE 31-Dec-1993
ACCESSIONS JC1126
REFERENCE JC1126
#authors Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
Morgenstern, J.P.; Rogers, B.L.
#journal Gene (1992) 113:263-268
#title Expression and genomic structure of the genes encoding FdI,
the major allergen from the domestic cat.
#accession JC1126
##molecule_type DNA
##residues 1-88 ##label GRI
GENETICS
#gene Ch1
#introns 17/1; 79/3
FEATURE
1-18
19-88 #product major allergen chain 1 #status predicted
SUMMARY #length 88 #molecular-weight 9586 #checksum 4095
SEQUENCE

Initial Score = 26 Optimized Score = 26 Significance = 16.71
Residue Identity = 96% Matches = 26 Mismatches = 1
Gaps = 0 Conservative Substitutions = 0

X 10 20 X

KRDVDLFLTGTPDEYVEQVAQYKALPV



10 20 X 30 



ENALSVLDKIYTSPLC 
80 



3. US-08-300-510-1 (1-27) 

JC1136 major allergen chain 1 precursor A - cat 



ENTRY JC1136 #type complete
TITLE major allergen chain 1 precursor A - cat
ORGANISM #formal_name Felis silvestris catus #common_name domestic cat
DATE 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
31-Dec-1993
ACCESSIONS JC1136
REFERENCE JC1126
#authors Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
Morgenstern, J.P.; Rogers, B.L.
#journal Gene (1992) 113:263-268
#title Expression and genomic structure of the genes encoding FdI,
the major allergen from the domestic cat.
#accession JC1136
##molecule_type DNA
##residues 1-92 ##label GRI
GENETICS
#gene Ch1
#introns 21/1; 83/3
FEATURE
1-22 #domain signal sequence #status predicted #label SIG\
23-92 #product major allergen chain 1 #status predicted #label MAT
SUMMARY #length 92 #molecular-weight 10072 #checksum 4988
SEQUENCE

Initial Score = 26 Optimized Score = 26 Significance = 16.71
Residue Identity = 96% Matches = 26 Mismatches = 1
Gaps = 0 Conservative Substitutions = 0



MKGACVLVLLWAALLLISGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYNALPVVLENARILKNCVDAKMT 

30 40 50 X 60 70 



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

I I I I 



10 



20 



EEDKENALSVLDKIYTSPLC 
80 90 



4. US-08-300-510-1 (1-27) 

ISUTTB triose-phosphate isomerase (EC 5.3.1.1) - Trypanos 



Trypanosoma brucei 



ENTRY ISUTTB #type complete
TITLE triose-phosphate isomerase (EC 5.3.1.1) - Trypanosoma brucei
ALTERNATE_NAMES triosephosphate mutase
ORGANISM #formal_name Trypanosoma brucei
DATE 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change
30-Jun-1993
ACCESSIONS A25110; A25186
REFERENCE A25110
#authors Swinkels, B.W.; Gibson, W.C.; Osinga, K.A.; Kramer, R.;
Veeneman, G.H.; van Boom, J.H.; Borst, P.
#journal EMBO J. (1986) 5:1291-1298
#title Characterization of the gene for the microbody (glycosomal)
triosephosphate isomerase of Trypanosoma brucei.
#cross-references MUID:86274631
#accession A25110
##molecule_type DNA
##residues 1-250 ##label SWI
##cross-references GB:X03921
COMMENT This enzyme catalyzes the interconversion of glyceraldehyde
3-phosphate and dihydroxyacetone phosphate.
CLASSIFICATION #superfamily triose-phosphate isomerase
KEYWORDS fatty acid biosynthesis; gluconeogenesis; glycolysis;
homodimer; intramolecular oxidoreductase; isomerase;
pentose phosphate pathway
FEATURE
2-250 #product triose-phosphate isomerase #label MAT\
95,167 #active_site His, Glu #status predicted
SUMMARY #length 250 #molecular-weight 26920 #checksum 4834
SEQUENCE

Initial Score = 10 Optimized Score = 10 Significance = 5.09
Residue Identity = 38% Matches = 10 Mismatches = 16
Gaps = 0 Conservative Substitutions = 0

ACIGETLQERESGRTAVVVLTQIAAIAKKLKKADWAKVVIAYEPVWAIGTGKVATPQQAQEAHALIRSWVSS 
130 140 150 160 170 180 190 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

III II I MM 

KIGADVRGELRILYGGSVNGKNARTLYQQRDVNGFLVGGASLKPEFVDIIKATQ 
200 210 220 X 230 240 250 



5. US-08-300-510-1 (1-27) 

PN0644 hypothetical protein 66 - Streptomyces coelicolor 



ENTRY PN0644 #type fragment
TITLE hypothetical protein 66 - Streptomyces coelicolor (fragment)
ORGANISM #formal_name Streptomyces coelicolor
DATE 03-May-1994 #sequence_revision 03-May-1994 #text_change
03-May-1994
ACCESSIONS PN0644
REFERENCE JN0831
#authors Wray Jr., L.V.; Fisher, S.H.
#journal Gene (1993) 130:145-150
#title The Streptomyces coelicolor glnR gene encodes a protein
similar to other bacterial response regulators.
#accession PN0644
##molecule_type DNA
##residues 1-66 ##label WRA
##cross-references GB:L03213
GENETICS
#start_codon GTG
SUMMARY #length 66 #checksum 9954
SEQUENCE

Initial Score = 9 Optimized Score = 9 Significance = 4.36
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 



MAKVTRDDVARLAGTSTAVVSYVINNGPRPVAPATRERVLAAIKELGYRPDRVAQAMASRRTDLIG 

30 40 50 60 



10 



20 



6. US-08-300-510-1 (1-27)

LNPG1 pulmonary surfactant protein 9K form - pig

ENTRY LNPG1 #type complete
TITLE pulmonary surfactant protein 9K form - pig
ALTERNATE_NAMES low molecular mass surfactant protein type 1
ORGANISM #formal_name Sus scrofa domestica #common_name domestic pig
DATE 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change
ACCESSIONS S00363
REFERENCE S00363
#authors Curstedt, T.; Johansson, J.; Barros-Soederling, J.;
Robertson, B.; Nilsson, G.; Westberg, M.; Joernvall, H.
#journal Eur. J. Biochem. (1988) 172:521-525
#title Low-molecular-mass surfactant protein type 1. The primary
structure of a hydrophobic 8-kDa polypeptide with eight
half-cystine residues.
#cross-references MUID:88166729
#accession S00363
##molecule_type protein
##residues 1-79 ##label CUR
COMMENT Pulmonary surfactant protein is a phospholipid-protein complex,
which reduces surface tension at the air-liquid interface of the
alveoli and thus facilitates gaseous exchange.
CLASSIFICATION #superfamily pulmonary surfactant protein B
KEYWORDS alveolar proteinosis; gaseous exchange; lipoprotein; lung;
pulmonary surfactant; respiratory distress syndrome
SUMMARY #length 79 #molecular-weight 8714 #checksum 5695
SEQUENCE

Initial Score = 9 Optimized Score = 9 Significance = 4.36
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

FPIPLPFCHLCRTLIK^ 

10 X 20 30 40 X 50 60 70 



LVLRCSS 



7. US-08-300-510-1 (1-27)

S15199 hydrogenase isozyme hypC - Escherichia coli

ENTRY S15199 #type complete
TITLE hydrogenase isozyme hypC - Escherichia coli
ORGANISM #formal_name Escherichia coli
DATE 21-Nov-1993 #sequence_revision 21-Nov-1993 #text_change
21-Nov-1993
ACCESSIONS S15199
REFERENCE S15197
#authors Lutz, S.; Jacobi, A.; Schlensog, V.; Boehm, R.; Sawers, G.;
Boeck, A.
#journal Mol. Microbiol. (1991) 5:123-135
#title Molecular characterization of an operon (hyp) necessary for
the activity of the three hydrogenase isoenzymes in
Escherichia coli.
#cross-references MUID:91194542
#accession S15199
##status preliminary
##residues 1-90 ##label LUT
##cross-references EMBL:X54543
SUMMARY #length 90 #molecular-weight 9732 #checksum 8904
SEQUENCE

Initial Score = 9 Optimized Score = 12 Significance = 4.36
Residue Identity = 41% Matches = 14 Mismatches = 13
Gaps = 7 Conservative Substitutions = 0



x 10 20 X 

KRDVDLFLTGTPDE YVEfi— VA8YKALPV 



MClGVPG9IRTIDCHQAKVDVCG16RDVDLTLVGSCDENG8PRVGflWVLVHVGFAMSVlNEAEAfiDTLDALS 



10 



20 



30 



40 



50 



60 



NMFDVEPDVGALLYGEEK 
80 90 



3. «^««-" 0 ;i p ;iTU a allotype-constant region kappa Cain - 



b 95 S aUot f y pe=crnsiant region kappa chain - rabbit 

IflK Oryctolagus cuniculus #co*Ron_na«e domestic 

02-^994 #sequence_revision 18-Nov-1994 Stext.change 

18-Nov-1994 
F53275 
A53275 

Ayadi, H.J Marche, P.N. J Cazenave, P. A. 
Innunooenetics (1991) 345201-207 

Evo^uUon of the rabbit immunoglobulin kappa chain genes. 
Scross-references MUID : 91372868 
ttaccession F53275 

ttstatus preliminary 

#ftmolecule_type DNA 

ftftresidues 1-104 *ttlabel AYA 

JJ™" sequence extracted fro« KCBI backbone 

SUMHARY tlength 104 Schecksun 7726 

SEQUENCE 



ENTRY 
TITLE 

ORGANISM 

DATE 

ACCESSIONS 
REFERENCE 
^authors 
# journal 
fttitle 



Initial Score = 9 Optimized Score = 11 Significance = 4.36
Residue Identity = 34% Matches = 12 Mismatches = 15
Gaps = 8 Conservative Substitutions = 0

X 10 
KRDVDLFLTGT- 



DPVAPTVLIFPPSPAELATGTATIVCV 



ANKYFPDVTVTWKVDGTTQTTGIENSRTPQNSDDCTYNLSSTLTL 



10 



20 



30 



40 



50 



60 



20 X 
PDEYVEQVAQYKALPV 

II! MM I I 
KSDEYNSHDEYICSVAQGSGSPVVQSFSRNNC 

80 90 X 100 



ENTRY 
TITLE 
ORGANISM 

DATE 

ACCESSIONS 
REFERENCE 
Sauthors 
8 journal 
#title 



K5RBV #type complete 

Ig kappa chain C region (BS variant) - rabbit 
tforMl.nan. Oryctolagus cuniculus #co«mon_name domestic 

13-^-1986 ftsequence.revision 13-Aug-1986 fttext.change 

04-NOV-1994 
A02124 
A02124 

Bernstein, K.E.J Skurla Jr., R.H.i Mage, R.G. 
Nucleic Acids Res. (1983) » ! *205-7214 
The sequences of rabbit kappa light chains of b4 ad 5 
allotypes differ more in their constant regions than in 



tneir o untrans ia^eo regions, 
•cross-references MUIB:84041515 
•contents Clone pkb5~F2 
•accession A02124 
••nolecule_type mRNA 



• •res idues 
••note 



CLASSIFICATION 

KEYWORDS 

FEATURE 

1-104 

19-87 
SUMMARY 
SEQUENCE 



1-104 ••label BER 

the cDNA from which this sequence was derived contains a 
terminator codon within the V-region coding region? 
the origin of this codon and of the differences 
between this and other sequenced b5 C regions are 
unclear? the cDNA clone was made using mRNA from 
trypanosome-inf ected b5-homozygous rabbits 

•superfamily immunoglobul in C region; immunoglobulin homology 

immunoglobul in 

•domain C region #label CRG\ 
•domain immunoglobulin homology #label I MM 
#length 104 •molecular-weight 11079 •checksum 6706 



Initial Score = 9 Optimized Score = 11 Significance = 4.36
Residue Identity = 34% Matches = 12 Mismatches = 15
Gaps = 8 Conservative Substitutions = 0



X 10 
KRDVDLFLTGT- 
I I I 

ATLAPTVLIFPPSPAELATGTATIVCVANKYFPDGTVTWGVDGKPLTTGIETSKTPGNSDDCTYNLSSTLTL 
10 20 30 40 50 60 70 



20 X 
FDEYVE8VAGYKALPV 

IN IN! II 

KSDEYNSHDEYTCQVAQGSGSPVVQSFSRKNC
80 90 X 100 



10. US-08-300-510-1 (1-27) 

A20968 Ig kappa-lb5 chain C region 



rabbit (fragment) 



ENTRY A20968 #type fragment
TITLE Ig kappa-lb5 chain C region - rabbit (fragment)
ORGANISM #formal_name Oryctolagus cuniculus #common_name domestic rabbit
DATE 10-Aug-1990 #sequence_revision 10-Aug-1990 #text_change 23-Mar-1993
ACCESSIONS A20968
REFERENCE A20968
#authors Emorine, L.; Sogn, J.A.; Trinh, D.; Kindt, T.J.; Max, E.E.
#journal Proc. Natl. Acad. Sci. U.S.A. (1984) 81:1789-1793
#title A genomic gene encoding the b5 rabbit immunoglobulin kappa constant region: implications for latent allotype phenomenon.
#cross-references MUID:84170387
#accession A20968
##status preliminary
##molecule_type DNA
##residues 1-105 ##label EMO
SUMMARY #length 105 #checksum 237

SEQUENCE 



Initial Score = 9 Optimized Score = 11 Significance = 4.36
Residue Identity = 34% Matches = 12 Mismatches = 15
Gaps = 8 Conservative Substitutions = 0



X 



10 



VATLAPTVLIFPPSPAEIATGTAT IVCVAWKYFPDGTVTWQV0GKPLTTGIETSKTP9NSDDCTVNLSSTLT 



10 



20 30 40 



20 X 
PDEYVEQVAQYKALPV 

III HI! il 
LKSDEYNSHDEYTCQVASGSGSPvVQSFSRKNC 

80 90 X 100 



S43188 #type complete
orotidine-5'-phosphate decarboxylase (EC 4.1.1.23)

Pseudomonas aeruginosa

ENTRY
TITLE

20-May-1994
ACCESSIONS S43188

#submission submitted to the EMBL Data Library, April 1992

#accession S43188

##status preliminary
##residues 1-232 ##label STR

##cross-references EMBL:X65613

SUMMARY #length 232 #molecular-weight 243A8 #checksum 1180

SEQUENCE 

S^, . 9 r u. ; - *.r. - J ! «■» 

Residue Identity = 33/. Hatches = 0 

" = 0 Conservative Substitutions 

Gap S 

ABQLOPKLCRVKVGKELFTSCAAGIVETLRGKGFEVFLOLKFHDIPNTTAMAVKAAAEtlGVHMVNVHCSGGL 

30 40 50 60 '0 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV

RMMAACRETLEAFSGARPLLIGVTVLTSMEREDLAGIGLDIEPQEfiVLRLAAtAfiKAGMDGLVCSAQEAPAL 

100 UO 120 X 130 130 

KAAHPGIQLVTPGIRPAGSAODDQRRILTPRQALDAGSDYLVICRPISSAA^^ 



12. US-08-300-510-1 (1-27)

JS0618 glutathione transferase (EC 2.5.1.18) Yrs precur 



ENTRY 
TITLE 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
Sauthors 

•journal 
•title 



18-Jun-1993
JS0618; PS0266

Ogura, K.; Nishiyama, T.; Okada, T.; Kajita, J.; Narihata, H.; Watabe, T.; Hiratsuka, A.; Watabe, T.

Molecular cloning and amino acid sequencing of rat liver class theta glutathione S-transferase Yrs-Yrs inactivating reactive sulfate esters of carcinogenic arylmethanols.
#cross-references MUID:92109741



#accession JS0618

##molecule_type mRNA

##residues 1-244 ##label OGU

#accession PS0266



COMMENT 
COMMENT 



Te\?Zl' Mf * £U«M1.«-I14,«40-.W..9MW..1MM ..I*.. 00 U . 
Glutathione transferase Yrs-Yrs is composed of two identical chains.
Glutathione transferase Yrs-Yrs belongs to class theta.



KEYWORDS transferase 



FEATURE 
2-244 

SUMMARY 
SEQUENCE 



#product glutathione transferase Yrs chain #status experimental #label GLU
#length 244 #molecular-weight 27439 #checksum 3952



Initial Score = 9 Optimized Score = 10 Significance = 4.36

Residue Identity = 37% Matches = 10 Mismatches = 17

Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV

I III I II I I I 

MGLELYLDLL5SPSRAVY IFAKKNGIPFQLRTVDLLKGQHLSEQFSQVNCLKKVPVLKDGSFVLTESTAILI 

10 20 30 40 50 X 60 70

YLSSKYQVADHUYPADLQARAQVHEYLGWHADNIRGTFGVLLHTKVLGPLIGVQVPEEKVERNRNSMVLAL6 
80 90 100 110 120 130



80 

RLEDKFLRDRAF 
150 



13. US-08-300-510-1 (1-27) 

A38233 triose-phosphate isomerase (EC 5.3.1.1) * iune v» 



fluke (Schistosoma 



A38233 #type complete 
triose-phosphate isomerase (EC 5.3.1.1) 

mansoni ) 
triosephosphate mutase
#formal_name Schistosoma mansoni

31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change

31-Dec-1993
A38233

A38233

Shoemaker, C.; Gross, A.; Gebremichael, A.; Harn, D.
Proc. Natl. Acad. Sci. U.S.A. (1992) 89:1842-1846
cDNA cloning and functional expression of the Schistosoma
mansoni protective antigen triose-phosphate isomerase.
#cross-references MUID:92179278
#accession A38233
##molecule_type mRNA
##residues 1-253 ##label SHO
##cross-references NCBIP:87225



ENTRY 
TITLE 

ALTERNATE.NAMES 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 

^authors 

8 journal 

*title 



##note 
CLASSIFICATION 
KEYWORDS 



SUMMARY 
SEQUENCE 



sequence extracted from NCBI backbone 
#superfamily triose-phosphate isomerase

fatty acid biosynthesis; gluconeogenesis; glycolysis;
homodimer; intramolecular oxidoreductase; isomerase;
pentose phosphate pathway
#length 253 #molecular-weight 28122 #checksum 3727



Initial Score = 9 Optimized Score = 9 Significance = 4.36
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



130 



140 150 160 



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 



NAPNGVDEKIRI I YGGSVTAANCKELAfiQHDVDGFLVGGASLKPEFTEICKARQR 
200 210 220 230 240 250 X 



14. US-08-300-510-1 (1-27) 

S04405 hydroxyneurosporene synthase 



Rhodobacter capsula 



ENTRY 
TITLE 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
ftauthors 
ft journal 
fttitle 



S04405 #type complete

hydroxyneurosporene synthase - Rhodobacter capsulatus
#formal_name Rhodobacter capsulatus

28-Feb-1990 #sequence_revision 28-Feb-1990 #text_change

18-Jun-1993
S04405

Armstrong, G.A.; Alberti, M.; Leach, F.; Hearst, J.E.

products of the carotenoid biosynthesis gene cluster of
Rhodobacter capsulatus.
#cross-references MUID:89313663
#accession S04405
##molecule_type DNA
##residues 1-281 ##label ARM

##cross-references EMBL:X52291

GENETICS

#gene crtC

KEYWORDS carotenoid biosynthesis

SUMMARY #length 281 #molecular-weight 31856 #checksum 8228

SEQUENCE 



Initial Score = 9 Optimized Score = 10 Significance = 4.36
Residue Identity = 35% Matches = 11 Mismatches = 16
Gaps = 4 Conservative Substitutions

X 10 

KRD VDLFLTGTPDEYVEQ



HI AFIGSVFSPWYRWSGRREPSNHCCINMVTTGTDGRFTMTDRGRSALRQSRDSFQVGPSKLTWTGKELV 
10 20 30 40 50 



ID 



20 X 
VAQYKALPV 

I 

VDEWG 



Ilpklgklkgrvvltpravtgvevrltpdaghtwrpfapiadvevdlapghkwtghgyfdan^gtra 



80 



90 



100 



110 



120 



leedfsfwtwgrfplkdrtvcfydatrldrtkvalav 

150 160 170 180 



15. US-08-300-510-1 (1-27) 
S21394 transposase 



ENTRY 
TITLE 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 



Mycobacterium tuberculosis 



S21394 #type complete

transposase - Mycobacterium tuberculosis 

22-Nov-1993 
S21394 

S21394 n 



#authors Mariani, F.; Piccolella, E.; Colizzi, V.; Rappuoli, R.; Gross, R.
#submission submitted to the EMBL Data Library, April 1992
#accession S21394
##status preliminary
##residues 1-308 ##label MAR
##cross-references EMBL:X65618
SUMMARY #length 308 #molecular-weight 34272 #checksum 8922

SEQUENCE 

Initial Score = 9 Optimized Score = 10 Significance = 4.36

Residue Identity = 30% Matches = 12 Mismatches = 15

Gaps = 12 Conservative Substitutions = 0

X 

KRDVDLFL 
II I 

MTRvGVISDEFWAVVEPLMPSHEGKPGRRFSDHRLILEGIAWRFRTGSPWRDLPAEFGPMQTVWKRHHRWSL 
10 20 30 40 50 60 X 70 

10 20 X 
TGTPDEYVEQVA QYKALPV 

DGTCDEVFAHVAAVFGVDAEVAEDIEKLLSVDSTNVRAHQHSAGAARTRSPQGALSDYKKSADEPDDHAIGR 
80 90 100 X 110 120 130 140 

SRGGLTTKIHALTCQREAPVRIRLTAGQAGDNPQLLPLLDDYRHASTEYALGSTDFRLL 
150 160 170 180 190 200 

> 0 < 

0| |0 IntelliGenetics 

> 0 < 

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file 1-spt.res made by on Fri 24 Mar 95 7:53:44-PST.

Query sequence being compared: US-08-300-510-1 (1-27)
Number of sequences searched: 40292
Number of scores above cutoff: 3849

Results of the initial comparison of US-08-300-510-1 (1-27) with:
Data bank : Swiss-Prot 30, all entries

[Histogram of the initial-score distribution: NUMBER OF SEQUENCES (logarithmic axis, 5 to 100000) plotted against SCORE (0 to 27) and STDEV. The plotted points are not recoverable from the scan.]



PARAMETERS

Similarity matrix         Unitary    K-tuple                2
Mismatch penalty                1    Joining penalty       20
Gap penalty                  1.00    Window size           27
Gap size penalty             0.05
Cutoff score                    0
Randomization group             0

Initial scores to save         40    Alignments to save    15
Optimized scores to save        0    Display context      100

SEARCH STATISTICS

Scores:    Mean           Median          Standard Deviation
           3              5               1.31

Times:     CPU            Total Elapsed
           00:00:38.04    00:00:40.00

Number of residues:               14147368
Number of sequences searched:     40292
Number of scores above cutoff:    3849

Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.

SEARCH STATISTICS 



The scores below are sorted by initial score. 



Significance is calculated based on initial score.

2 100% similar sequences to the query sequence were found:



                                                  Init.  Opt.
                                          Length  Score  Score   Sig.  Frame

 1. FELA_FELCA  MAJOR ALLERGEN I POLYPEPTIDE     92     27     27  18.34      0
 2. FELB_FELCA  MAJOR ALLERGEN I POLYPEPTIDE     88     27     27  18.34      0


The list of other best scores is:

                                                  Init.  Opt.
                Description               Length  Score  Score   Sig.  Frame

 3. TPIS_TRYBB  TRIOSEPHOSPHATE ISOMERASE, GL    250     10     10   5.35      0
                **** 4 standard deviations above mean ****
 4. PSPB_PIG    PULMONARY SURFACTANT-ASSOCIAT     79      9      9   4.58      0
 5. HYPC_ECOLI  HYDROGENASE ISOENZYMES FORMAT     90      9     12   4.58      0
 6. KAC6_RABIT  IG KAPPA CHAIN B5 VARIANT C R    104      9     11   4.58      0
 7. FIMA_BORPE  FIMA PROTEIN.                    145      9     11   4.58      0
 8. GTTR_RAT    GLUTATHIONE S-TRANSFERASE YRS    243      9     10   4.58      0
 9. CRTC_RHOCA  HYDROXYNEUROSPORENE DEHYDROGE    281      9     10   4.58      0
10. FVT1_HUMAN  FOLLICULAR VARIANT TRANSLOCAT    332      9      9   4.58      0
11. E13B_HORVU  GLUCAN ENDO-1,3-BETA-GLUCOSID    334      9      9   4.58      0
12. ALGP_PSEAE  TRANSCRIPTIONAL REGULATORY PR    340      9      9   4.58      0
13. CHLI_EUGGR  PROBABLE MAGNESIUM-CHELATASE     348      9      9   4.58      0
14. PRTZ_HORVU  PROTEIN Z (Z4) (MAJOR ENDOSPE    399      9      9   4.58      0
15. FRE2_STAAU  PLASMID RECOMBINATION ENZYME     420      9      9   4.58      0
16. CHLI_ARATH  PROBABLE MAGNESIUM-CHELATASE     424      9      9   4.58      0
17. AROA_MYCTU  3-PHOSPHOSHIKIMATE 1-CARBOXYV    450      9      9   4.58      0
18. PCD_ARTOX   PHENMEDIPHAM HYDROLASE (3.1.1    493      9     11   4.58      0
19. AMPL_SOLTU  CYTOSOL AMINOPEPTIDASE (EC 3.    554      9     10   4.58      0
20. GAG_SFV1    GAG POLYPROTEIN (CORE POLYPRO    647      9     10   4.58      0
21. PUPi_PSEPU  FERRIC-PSEUDOBACTIN RECEPTOR     809      9      9   4.58      0
22. PTF1_RHOCA  MULTIPHOSPHORYL TRANSFER PROT    827      9     10   4.58      0
23. VIRA_AGRT9  WIDE HOST RANGE (WHR) VIRA PR    829      9      9   4.58      0
24. VIRA_AGRT6  WIDE HOST RANGE (WHR) VIRA PR    829      9      9   4.58      0
25. IC1B_HCMVA  PROBABLE PROCESSING AND TRANS    850      9     10   4.58      0
26. RPB2_SCHPO  DNA-DIRECTED RNA POLYMERASE I   1210      9     10   4.58      0
27. MYSC_CAEEL  MYOSIN HEAVY CHAIN C (MHC C).   1947      9      9   4.58      0
                **** 3 standard deviations above mean ****
28. RL27_PEA    60S RIBOSOMAL PROTEIN L27.       135      8      9   3.82      0
29. RL25_YEAST  60S RIBOSOMAL PROTEIN L25 (RP    136      8      9   3.82      0
30. HUPG_RHILV  HUPG PROTEIN.                    149      8      8   3.82      0
31. PCCB_HUMAN  PROPIONYL-COA CARBOXYLASE BET    155      8      9   3.82      0
32. MEMG_METTR  METHANE MONOOXYGENASE COMPONE    169      8      8   3.82      0
33. YGT2_CHLPS  HYPOTHETICAL 21.0 KD PROTEIN     182      8      9   3.82      0
34. YC08_YEAST  HYPOTHETICAL 21.1 KD PROTEIN     192      8      8   3.82      0
35. YEGA_ECOLI  HYPOTHETICAL IN DCD 3'REGION     200      8      8   3.82      0
36. COX3_THEP3  CYTOCHROME C OXIDASE POLYPEPT    207      8      8   3.82      0
37. SSPA_ECOLI  STRINGENT STARVATION PROTEIN     212      8      9   3.82      0
38. SI21_RAT    SERINE PROTEASE INHIBITOR 2.1    214      8     11   3.82      0
39. VHEL_"lvx   PROBABLE HELICASE (ORF 2).       216      8      9   3.82      0
40. YIAQ_ECOLI  HYPOTHETICAL 23.4 KD PROTEIN     220      8      9   3.82      0



1. US-08-300-510-1 (1-27)
FELA_FELCA MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PR

ID FELA_FELCA STANDARD; PRT; 92 AA.

AC P30438;

DT 01-APR-1993 (REL. 25, CREATED)

DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)



DE MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PRECURSOR (FEL D I)
DE (CAT-1) (AG 4).
GN CH1.
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; CARNIVORA.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 23-92.
RC TISSUE=SALIVARY GLAND;
RA MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.W., ROGERS B.L.,
RA BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN [2]
RP SEQUENCE FROM N.A.
RA GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA ROGERS B.L.;
RL GENE 113:263-268(1992).
RN [3]
RP SEQUENCE OF 23-62, AND CHARACTERIZATION.
RA DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL MOL. IMMUNOL. 28:301-309(1991).
RN [4]
RP CHARACTERIZATION.
RA LEITERMANN K., OHMAN J.L. JR.;
RL J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).

CC -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC OF THIS ALLERGEN SUBUNIT.
CC -!- SIMILARITY: TO UTEROGLOBIN.
DR EMBL; M74952; FDFELDI.
DR PIR; JC1136; JC1136.
DR PROSITE; PS00403; UTEROGLOBIN_1.
DR PROSITE; PS00404; UTEROGLOBIN_2.
KW ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT CHAIN 23 92 MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT DISULFID 25 25 INTERCHAIN (POTENTIAL).
FT DISULFID 92 92 INTERCHAIN (POTENTIAL).
FT VARIANT 51 51 K -> N.
FT CONFLICT 5 5 R -> C (IN REF. 2).
FT CONFLICT 18 18 W -> S (IN REF. 2).
FT CONFLICT 82 82 L -> V (IN REF. 2).
SQ SEQUENCE 92 AA; 10252 MW; 43206 CN;

Initial Score = 27 Optimized Score = 27 Significance = 18.34
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

MKGARVLVLLWAALLLI WGGNCEICPAVKRDVDLFLTGTPDEYVEQV^ 

10 20 30 40 50 X 60 



EEDKENALSLLDKIYTSPLC 
80 90 



FELB_FELCA MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PR

ID FELB_FELCA STANDARD; PRT; 88 AA.
AC P30439;
DT 01-APR-1993 (REL. 25, CREATED)
DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PRECURSOR (FEL D I)
DE (CAT-1) (AG 4).
GN CH1.
OS FELIS CATUS (CAT).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; CARNIVORA.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 19-88.
RM 92052157
RA MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.W., ROGERS B.L.,
RA BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN [2]
RP SEQUENCE FROM N.A.
RM 92241678
RA GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA ROGERS B.L.;
RL GENE 113:263-268(1992).
RN [3]
RP SEQUENCE OF 19-58, AND CHARACTERIZATION.
RM 91287714
RA DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL MOL. IMMUNOL. 28:301-309(1991).
RN [4]
RP CHARACTERIZATION.
RA LEITERMANN K., OHMAN J.L. JR.;
RL J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).
CC -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC OF THIS ALLERGEN SUBUNIT.
CC -!- SIMILARITY: TO UTEROGLOBIN.
DR EMBL; M74953; FDFELDIB.
DR PIR; JC1126; JC1126.
DR PROSITE; PS00403; UTEROGLOBIN_1.
DR PROSITE; PS00404; UTEROGLOBIN_2.
KW ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT SIGNAL 1 18
FT CHAIN 19 88 MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT DISULFID 21 21 INTERCHAIN (POTENTIAL).
FT DISULFID 88 88 INTERCHAIN (POTENTIAL).
FT VARIANT 47 47 K -> N.
FT CONFLICT 78 78 L -> V (IN REF. 2).
SQ SEQUENCE 88 AA; 9614 MW; 39445 CN;

Initial Score = 27 Optimized Score = 27 Significance = 18.34
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYV^ 

10 20 X 30 40 50 60 70 



80 



3. US-08-300-510-1 (1-27)
TPIS_TRYBB TRIOSEPHOSPHATE ISOMERASE, GLYCOSOMAL (EC 5.3.1.1)

ID TPIS_TRYBB STANDARD; PRT; 250 AA.
AC P04789;
DT 13-AUG-1987 (REL. 05, CREATED)
DT 13-AUG-1987 (REL. 05, LAST SEQUENCE UPDATE)
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE TRIOSEPHOSPHATE ISOMERASE, GLYCOSOMAL (EC 5.3.1.1) (TIM).
OC EUKARYOTA; PROTOZOA; SARCOMASTIGOPHORA; MASTIGOPHORA; KINETOPLASTIDA;
OC TRYPANOSOMATIDAE.
RN [1]
RP SEQUENCE FROM N.A.
RA SWINKELS B.W., GIBSON W.C., OSINGA K.A., KRAMER R., VEENEMAN G.H.,
RA VAN BOOM J.H., BORST P.;
RL EMBO J. 5:1291-1298(1986).
RN [2]
RP SEQUENCE.
RM 86187863
RA BORST P.;
RL BIOCHIM. BIOPHYS. ACTA 866:179-203(1986).
RP X-RAY CRYSTALLOGRAPHY (2.4 ANGSTROMS).
RM 88118904
RA WIERENGA R.K., KALK K.H., HOL W.G.J.;
RL J. MOL. BIOL. 198:109-121(1987).
RP X-RAY CRYSTALLOGRAPHY (1.83 ANGSTROMS).
RA WIERENGA R.K., NOBLE M.E.M., VRIEND G., NAUCHE S., HOL W.G.J.;
RL J. MOL. BIOL. 220:995-1015(1991).
RN [5]
RP X-RAY CRYSTALLOGRAPHY.
RM 92235847
RA WIERENGA R.K., NOBLE M.E.M., DAVENPORT R.L.;



RA 



s -"catalytic s?;ji":"-^nii»«'« ■ 

CC ACETONE PHOSPHATE. 

" I! "I "an'Ihportakt role .. SEVERAL aetabouc pathways 



CC -in. .....i . - ... • 

CC -!- SUBCELLULAR LOCATION: GLYCOSOMAL.
CC -!- THE ENZYME CONTAINS A HIGH PROPORTION OF POSITIVELY-CHARGED
CC RESIDUES IN BETA-BARRELS V & VII (COMPARED TO THE HOMOLOGOUS
CC REGIONS IN OTHER TRIOSE ISOMERASE SEQUENCES). SINCE 2 CLUSTERS
CC OF + CHARGES LOCATED AT PRECISE DISTANCES ON THE MOLECULAR SURFACE
CC ARE COMMON TO 4 GLYCOSOMAL ENZYMES, [1] SPECULATES THAT THIS MIGHT
CC REPRESENT A SIGNAL FOR ENTRY INTO GLYCOSOMES.
DR EMBL; X03921; TBTIM.
DR PIR; A25110; ISUTTB.
DR PDB; 3TIM; 15-OCT-91.
DR PDB; 4TIM; 15-OCT-92.
DR PDB; 5TIM; 15-OCT-92.
DR PDB; 6TIM; 31-JAN-94.
DR PDB; 1TRD; 31-OCT-93.
DR PDB; 1TS1; 31-JAN-94.
KW ISOMERASE; GLYCOLYSIS; GLUCONEOGENESIS; FATTY ACID BIOSYNTHESIS;
KW PENTOSE SHUNT; GLYCOSOME; 3D-STRUCTURE.
FT ACT_SITE 95 95
FT ACT_SITE 167 167



F\ 




7 


i i 


FT STRAND  14  14
FT HELIX   18  30
FT STRAND  38  43
FT TURN    46  47
FT HELIX   48  54
FT TURN    58  59
FT STRAND  60  64
FT STRAND  68  68
FT STRAND  72  72
FT TURN    75  76
FT STRAND  79  79
FT HELIX   80  85
FT TURN    86  87
FT STRAND  90  93
FT HELIX   96 101
FT HELIX  106 118
FT TURN   119 120
FT STRAND 122 127
FT HELIX  131 135
FT TURN   136 137
FT HELIX  139 151
FT TURN   152 153
FT HELIX  156 161
FT STRAND 162 166
FT HELIX  169 171
FT HELIX  180 197
FT TURN   198 198
FT HELIX  200 205
FT STRAND 207 210
FT HELIX  216 223
FT TURN   224 224
FT TURN   226 227
FT STRAND 230 233
FT HELIX  235 238
FT TURN   240 241
FT HELIX  242 247
FT TURN   248 249
SQ SEQUENCE 250 AA; 26'



Initial Score = 10 Optimized Score = 10 Significance = 5.35 
Residue Identity = 38% Matches = 10 Mismatches = 16

Gaps = 0 Conservative Substitutions = 0 

ACIGETLQERESGRTAVVVLTQIAAIAKKLKKADWAKVVIAYEPVWAIGTGKVATPQQASEAHALIRSWVSS 
130 140 150 160 170 180 190 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

I ! I Ml I I II 

KIGADVRGELRILYGGSVNGKNARTLYQQRDVNGFLVGGASLKPEFVDI IKATQ 
200 210 220 X 230 240 250 



4. US-08-300-510-1 (1-27) 

PSPB_PIG PULMONARY SURFACTANT-ASSOCIATED PROTEIN B (SP-B) ( 

ID PSPB_PIG STANDARD; PRT; 79 AA.

AC P15782;

DT 01-APR-1990 (REL. 14, CREATED)

DT 01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)

DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)

DE PULMONARY SURFACTANT-ASSOCIATED PROTEIN B (SP-B) (8 KD PROTEIN)

DE (PULMONARY SURFACTANT-ASSOCIATED PROTEOLIPID SPL(PHE)).

OS SUS SCROFA (PIG).

OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;



RN 
RP 
RM 
RA 
RA 
RL 
RN 
RP 
RM 
RA 
RL 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
DR 
KW 
FT 
FT 
FT 
FT 
FT 
SQ 

Init 
Resi 
Gaps 



til 

SEQUENCE. 
88166729 
CURSTEDT T 
NILSSON G. 



, JOHANSSON 
WESTBERG H. 



., BARROS-SOEDERLING 
JQERNVALL H . > 



J„, ROBERTSON B. r 



J. BIOCHEM. 172:521-525(1988) 



EUR. 
C2] 

DISULFIDE BONDS. 

9 1 299745 

JOHANSSON J.- CURSTEDT T., JOERNVALL H.J 

™.KW«T«T CONSISTS OF ,.I "'»»!«« ^AR^OHYDRATE- 

adc a ciibpatt ANT ASSOCIATED PROTEIN'. 2 COLLAGENOUS, CARBOHYUKft t 
^MnHlYC^ROTe"" .SP-A AND IP-I. m 2 SHALL HYDROPHOB.C 

PROTEINS (SP-B AND SP-C) . 
SUBCELLULAR LOCATION: EXTRACELLULAR . 
SUBUNIT: HOMQDIMER. DISULFIDE-LINKED . 
S00363', LNPG1. 

- EXCHANGE. 



- i - 



P I R i 

SURFACE FILM; LUNG 

DISULFID 3 

DISULFID li 

DISULFID 35 

DISULFID 48 

VARIANT 57 

SEQUENCE 79 AA 3 



GASEOUS 
77 
71 
46 
48 
57 

8714 MW; 



INTERCHAIN. 
C -> L. 
33297 CN'r 



ial Score = 
due Identity = 



FPIPLPFCWLCRTLIKR 

10 X 20 



9 Optimized Score = 9 Significance = 4.58

33% Matches = 9 Mismatches = 18

0 Conservative Substitutions

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

IMVVPKGV^ 

40 X 50 60 /y 



30 



LVLRCSS 



5. US-08-300-510-1 (1-27)
HYPC_ECOLI HYDROGENASE ISOENZYMES FORMATION PROTEIN HYPC.

ID HYPC_ECOLI STANDARD; PRT; 90 AA.
AC P24191;
DT 01-MAR-1992 (REL. 21, CREATED)
DT 01-MAR-1992 (REL. 21, LAST SEQUENCE UPDATE)
DT 01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE)
DE HYDROGENASE ISOENZYMES FORMATION PROTEIN HYPC.
GN HYPC.
OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS;
OC ENTEROBACTERIACEAE.
RN [1]
RP SEQUENCE FROM N.A.
RA LUTZ S., JACOBI A., SCHLENSOG V., BOEHM R., SAWERS G., BOECK A.;

f C T-^wiX™™"* FORMATION OF ALL THREE HYDROGENASE 

CC ISOENZYMES . — 



DR EMBL; X54543; ECHYP. 

DR PIR; S15199; S15199.
DR ECO2DBASE; A008.0; 6TH EDITION.
DR ECOGENE; EG10485; HYPC.
SQ SEQUENCE 90 AA; 9732 MW; 39420 CN;

Initial Score = 9 Optimized Score = 12 Significance = 4.58
Residue Identity = 41% Matches = 14 Mismatches = 13
Gaps = 7 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDE YVEQ VAQYKALPV 

I | | | | | | II II I II 

MriGVPGQIRTIDGNQAKVDVCGIQRDVDLTLVGSCDENGQPRVGQMVLVHVGFAMSvINEAEARDTLDALQ 

10 20 X 30 40 50 60 70 

NMFDVEPDVGALLYGEEK 
80 90 



6. US-08-300-510-1 (1-27) 

KAC6_RABIT IG KAPPA CHAIN B5 VARIANT C REGION.

ID KAC6_RABIT STANDARD; PRT; 104 AA.
AC P039i4;
DT 23-OCT-1986 (REL. 02, CREATED)
DT 23-OCT-1986 (REL. 02, LAST SEQUENCE UPDATE)
DT 01-APR-1988 (REL. 07, LAST ANNOTATION UPDATE)
DE IG KAPPA CHAIN B5 VARIANT C REGION.
OS ORYCTOLAGUS CUNICULUS (RABBIT).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; LAGOMORPHA.
RN [1]
RP CLONE PKB5-F2, SEQUENCE FROM N.A.
RM 84041515
RA BERNSTEIN K.E., SKURLA R.M. JR., MAGE R.G.;
RL NUCLEIC ACIDS RES. 11:7205-7214(1983).
CC -!- THE CDNA FROM WHICH THIS SEQUENCE WAS DERIVED CONTAINS A
CC TERMINATOR CODON WITHIN THE V-REGION CODING REGION. THE ORIGIN
CC OF THIS CODON AND OF THE DIFFERENCES BETWEEN THIS AND OTHER
CC SEQUENCED B5 C REGIONS ARE UNCLEAR. THE CDNA CLONE WAS MADE
CC USING MRNA FROM TRYPANOSOME-INFECTED B5-HOMOZYGOUS RABBITS.
DR PIR; A02124; K5RBV.
DR PROSITE; PS00290; IG_MHC.
KW IMMUNOGLOBULIN C REGION.
FT NON_TER 1 1
FT DISULFID 26 85
FT DISULFID 104 104 INTERCHAIN (WITH A HEAVY CHAIN).
SQ SEQUENCE 104 AA; 11079 MW; 62252 CN;

Initial Score = 9 Optimized Score = 11 Significance = 4.58
Residue Identity = 34% Matches = 12 Mismatches = 15
Gaps = 8 Conservative Substitutions = 0

X 10 
KRDVDLFLTGT- 

I I I 

ATLAPTVLIFPPSPAELATGTATIVCVANKYFPDGTVTWQVDGKPLTTGIETSKTPQNSDDCTYNLSSTLTL
10 20 30 40 50 60 70 

20 X 
PDEYVEQVAQYKALPV 

III Mil M 
KSDEYNSHDEYTCQVAQGSGSPVVQSFSRKNC



80 



90 



100 



7. US-08-300-510-1 (1-27) 

FIMA_BORPE FIMA PROTEIN.

ID FIMA_BORPE STANDARD; PRT; 145 AA.
AC P35076;
DT 01-FEB-1994 (REL. 28, CREATED)
DT 01-FEB-1994 (REL. 28, LAST SEQUENCE UPDATE)
DT 01-FEB-1994 (REL. 28, LAST ANNOTATION UPDATE)
DE FIMA PROTEIN.
GN FIMA.
OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; AEROBIC RODS AND COCCI;
OC ALCALIGENACEAE.
RN [1]
RP SEQUENCE FROM N.A.
RM 93078620
RA WILLEMS R.J., DER HEIDE H.G., MOOI F.R.;
RL MOL. MICROBIOL. 6:2661-2671(1992).
DR EMBL; X64876; BPFIMABC.
SQ SEQUENCE 145 AA; 15134 MW; 107653 CN;

Initial Score = 9 Optimized Score = 11 Significance = 4.58

Residue Identity = 37% Matches = 12 Mismatches = 15

Residue y = conservative Substitutions - ° 

X 10 20 

KRDVD LFLTGTPDEYVEQVAQY

MQLPTISRTALKDVGSTAGGTVFDVKLTECPSALNGQ6VGLFFESGGTVDYTSGNLFAYRADSSGVEQVPQT 
10 20 30 40 X oO 60 

X 

KALPV 

KADNVQANLDGSAIHLGRNKGAQAAQTFLVSSTAGSSTYGATLRYLACYIRSGAGS^ 



X 80 



90 100 HO 



8. US-08-300-510-1 (1-27)
GTTR_RAT GLUTATHIONE S-TRANSFERASE YRS-YRS (EC 2.5.1.18) (C

ID GTTR_RAT STANDARD; PRT; 243 AA.
AC P36971;
DT 01-JUN-1994 (REL. 29, CREATED)
DT 01-JUN-1994 (REL. 29, LAST SEQUENCE UPDATE)
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE GLUTATHIONE S-TRANSFERASE YRS-YRS (EC 2.5.1.18) (CLASS-THETA).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; RODENTIA.
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=LIVER;
RA OGURA K., NISHIYAMA T., OKADA T., KAJITA J., NARIHATA H., WATABE T.,
RA HIRATSUKA A., WATABE T.;
RL BIOCHEM. BIOPHYS. RES. COMMUN. 181:1294-1300(1991).
RP SEQUENCE OF 1-25, AND CHARACTERIZATION.
RA HIRATSUKA A., SEBATA N., KAWASHIMA K., OKUDA H., OGURA K., WATABE T.,



R C L c ^ ^NCmN^C^rES^ir^^SATION OF REACTIVE SULFATE ESTERS IN 

CC CARCINOGENIC ARYLMETHANOLS . HIGHEST ACTIVITY TOWARDS ETHACRYNIC 

CC ACID AND CUMENE HYDROPEROXIDE. 

CC -!- CATALYTIC ACTIVITY: RX + GLUTATHIONE = HX + R-S-G. 

CC -!- SUBUNIT: HOMODIMER. 

CC -!- SUBCELLULAR LOCATION: CYTOPLASMIC.
CC -!- TISSUE SPECIFICITY: HIGHEST VALUES FOUND IN LIVER FOLLOWED BY
CC TESTIS, ADRENAL GLAND, KIDNEY, LUNG, BRAIN AND SKELETAL MUSCLE.
CC -!- SIMILARITY: WITH OTHER GLUTATHIONE S-TRANSFERASES. BELONGS TO
CC CLASS THETA.
DR EMBL; D10026; RNGSTYRS.
DR PIR; JS0618; JS0618.
DR PIR; A37069; A37069.
KW TRANSFERASE; MULTIGENE FAMILY.
FT INIT_MET 0 0
SQ SEQUENCE 243 AA; 27308 MW; 285713 CN;

Initial Score = 9 Optimized Score = 10 Significance = 4.58
Residue Identity = 37% Matches = 10 Mismatches = 17
Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

I III I II 1 11 

GLELYLDLLSQPSRAVYIFAKKNG IPFQLRTVDLLKGQHLSEQFSQVNCLKKVPVLKDGSFVLTESTAILIY 

10 20 30 40 50 X 60 iv 

LSSKYQVADHWYPADLQARAQVHEYLGWHADNIRGTFGVLLWTKVLGPLIGVQVPEEKVERNRNSMVLALQR 
80 90 100 HO 120 130 

LEDKFLRDRAF 
150 



9. US-08-300-510-1 (1-27)

CRTC_RHOCA HYDROXYNEUROSPORENE DEHYDROGENASE (EC 1.-.-.-) (HY

ID CRTC_RHOCA STANDARD; PRT; 281 AA.
AC P17058;
DT 01-AUG-1990 (REL. 15, CREATED)
DT 01-AUG-1990 (REL. 15, LAST SEQUENCE UPDATE)
DT 01-MAY-1992 (REL. 22, LAST ANNOTATION UPDATE)
DE HYDROXYNEUROSPORENE DEHYDROGENASE (EC 1.-.-.-) (HYDROXYNEUROSPORENE
DE SYNTHASE).
GN CRTC.
OS RHODOBACTER CAPSULATUS (RHODOPSEUDOMONAS CAPSULATA).
OC PROKARYOTA; GRACILICUTES; ANOXYPHOTOBACTERIA; PURPLE BACTERIA;
OC RHODOSPIRILLACEAE.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SB1003, AND BEC404;
RM 89313663
RA ARMSTRONG G.A., ALBERTI M., LEACH F., HEARST J.E.;
RL MOL. GEN. GENET. 216:254-268(1989).
CC -!- PATHWAY: CAROTENOID AND CHLOROPHYLL BIOSYNTHESIS.
DR EMBL; X52291; RCCRTAK.
DR EMBL; Z11165; RCPHSYNG.
KW PHOTOSYNTHESIS; CHLOROPHYLL BIOSYNTHESIS; CAROTENOID BIOSYNTHESIS;
KW OXIDOREDUCTASE.
SQ SEQUENCE 281 AA; 31856 MW; 405094 CN;

Initial Score = 9 Optimized Score = 10 Significance = 4.58
Residue Identity = 35% Matches = 11 Mismatches = 16
Gaps = 4 Conservative Substitutions = 0



x 10 

KRD VDLFLTGTPDEYVEQ

II II I I I 

MI AFIGSVFSPWYRWSGRREPQNHCCINMVTTGTDGRFTMTDRGRSALRQSRDSFQVGPSKLTWTGKELVID 

10 20 30 40 50 60 70 

20 X 
VAQYKALPV 

UDEWGALPKLGKLKGRVVLTPRAVTGVEVRLTPDAGHTMRPFAPI ADVEVDLAPGHKWTGHGYFDANFGTRA 
80 90 100 110 120 130 140 

LEEDFSFWTWGRFPLKDRTVCFYDATRLDRTKVALAV 
150 160 170 180 



10. US-08-300-510-1 (1-27) 

FVT1_HUMAN FOLLICULAR VARIANT TRANSLOCATION PROTEIN 1 PRECURS

ID FVT1_HUMAN STANDARD; PRT; 332 AA.
AC Q06136;
DT 01-JUN-1994 (REL. 29, CREATED)
DT 01-JUN-1994 (REL. 29, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE FOLLICULAR VARIANT TRANSLOCATION PROTEIN 1 PRECURSOR (FVT-1).
GN FVT1.
OS HOMO SAPIENS (HUMAN).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; PRIMATES.
RN [1]
RP SEQUENCE FROM N.A.
RM 93112945
RA RIMOKH R., GADOUX M., BERTHEAS M.-F., BERGER F., GAROSCIO M.,
RA DELEAGE G., GERMAIN D., MAGAUD J.-P.;
RL BLOOD 81:136-142(1993).
CC -!- SUBCELLULAR LOCATION: SECRETED (POTENTIAL).
CC -!- TISSUE SPECIFICITY: WEAKLY EXPRESSED IN NORMAL HEMATOPOIETIC
CC TISSUES. HIGHER EXPRESSION IN SOME T-CELL MALIGNANCIES AND PHA-
CC STIMULATED LYMPHOCYTES.
CC -!- DISEASE: INVOLVED IN A T(2;18)(P11;Q21) CHROMOSOMAL TRANSLOCATION
CC WITH A IG J KAPPA CHAIN REGION THAT PRODUCES AN ONCOGENE
CC RESPONSIBLE FOR FOLLICULAR LYMPHOMA (ALSO KNOWN AS TYPE II CHRONIC
CC LYMPHATIC LEUKEMIA).
DR EMBL; S51904; HSFVT1A.
DR PIR; S37652; S37652.
DR MIM; 136440; 11TH EDITION.
KW PROTO-ONCOGENE; CHROMOSOMAL TRANSLOCATION; SIGNAL.
FT SIGNAL 1 25 POTENTIAL.
FT CHAIN 26 332 FOLLICULAR VARIANT TRANSLOCATION
FT PROTEIN 1.
SQ SEQUENCE 332 AA; 36187 MW; 585292 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



MLLLAAAFLVAFVLLLYMVSPLISPKPLALPGAHVVVTGGSSGIGKC I AIECYKQGAFITLVARNEDKLLQA 
10 20 30 40 50 60 70 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

II || III M 

KKEIEMHSINDKQVVLCISVDVSQDYNQVENVIKQAQEKLGPVDMLVNCAGMAVSGKFEDLEVSTFERLMSI 

80 90 100 110 X 120 130 140



NYLGSVYPSRAVITTMKERRVGRIVFVSSQAGQLGLFGFTAYSASKFAIRGLAEALQMEVKPYNVYITVAY 
150 160 170 180 190 200 210



11. US-08-300-510-1 (1-27) ,_ or , 

E13B.H0RVU GLUCAN END0-1 , 3-BETA-GLUC0SIDASE GII PRECURSOR (EC 

ID E13B_H0RVU STANDARD? PRT; 334 AA. 

AC P15737J 

DT 01-APR-1990 (REL. 14, CREATED) 

DT Ol-APR-1990 (REL. 14, LAST SEQUENCE UPDATE) 

DT Ol-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE) 

DE GLUCAN ENDO-1 ,3-BETA-GLUCOSIDASE GII PRECURSOR (EC 3.2.1.39) (d->3)- 

DE BETA-GLUCAN ENDOHYDROLASE) (( l->3> -BETA-GLUCANASE ISOENZYME GII). 

OS HORDEUH VULGARE (BARLEY). 

QC EUKARYOTA J PLANTA; EHBRYOPHYTA; ANGIOSPERMAE J MONQCOT YLEDONEAE ', 

OC CYPERALES *, GRAMINEAE. 

RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 29-68.
RC STRAIN=CV. CLIPPER;
RA HOEJ P.B., HARTMAN D.J., MORRICE N.A., DOAN D.N.P., FINCHER G.B.;
RL PLANT MOL. BIOL. 13:31-42(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=CV. PIGGY;
RM 91107649
RA LEAH R., TOMMERUP H., SVENDSEN I., MUNDY J.;
RL J. BIOL. CHEM. 266:1564-1573(1991).
RN [3]
RP SEQUENCE OF 258-332 FROM N.A.
RC TISSUE=LEAF;
RA JUTIDAMRONGPHAN W., MACKINNON G., MANNERS J., SIMPSON R.S.,
RA SCOTT K.J.;
RL SUBMITTED (AUG-1989) TO EMBL/GENBANK/DDBJ DATA BANKS.
RN [4]
RP SEQUENCE OF 29-334.
RA BALLANCE G.M., SVENDSEN I.;
RL CARLSBERG RES. COMMUN. 53:411-419(1988).

CC -!- FUNCTION: MAY PROVIDE A DEGREE OF PROTECTION AGAINST MICROBIAL
CC INVASION OF GERMINATED BARLEY GRAIN THROUGH ITS ABILITY TO DEGRADE
CC FUNGAL CELL WALL POLYSACCHARIDES.
CC -!- CATALYTIC ACTIVITY: HYDROLYSIS OF 1,3-BETA-D-GLUCOSIDIC LINKAGES
CC IN 1,3-BETA-D-GLUCANS.
CC -!- SIMILARITY: BELONGS TO FAMILY 17 OF GLYCOSYL HYDROLASES.

DR EMBL; X15205; HV13BGE.
DR EMBL; M62907; HVCBGL32.
DR EMBL; X16274; HVB13GLU.
DR EMBL; M23548; HVGEH.
DR PIR; S05510; S05510.
DR PIR; A31800; A31800.
DR PROSITE; PS00587; GLYCOSYL_HYDROL_F17.
KW HYDROLASE; GLYCOSIDASE; SIGNAL; MULTIGENE FAMILY.
FT CHAIN 29 334 GLUCAN ENDO-1,3-BETA-GLUCOSIDASE GII.
FT ACT_SITE 246 246 POTENTIAL.
FT ACT_SITE 259 259 POTENTIAL.
FT CONFLICT 12 12 A -> V (IN REF. 2).
FT CONFLICT 71 71 L -> V (IN REF. 2).
SQ SEQUENCE 334 AA; 35193 MW; 549197 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58 

Residue Identity = 33% Matches = 9 Mismatches = 18

Gaps = 0 Conservative Substitutions = 0



130 140 150 160 170 180 190

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

ANVYPYFAYRDNPGSISLNYATFQPGTTVRDQNNGLTYTSLFDAMVDAVYAALEKAGAPAVKVVVSESGWPS 

200 210 220 230 240 250 260

AGGFAASAGNARTYNQGLINHVGGGTPKKREALETYIFAMFNENQKTGDATERSFGLFNPDKSPAYNIQF 
270 280 290 300 310 320 330 



12. ALGP_PSEAE TRANSCRIPTIONAL REGULATORY PROTEIN ALGP (ALGINATE

ID ALGP_PSEAE STANDARD; PRT; 340 AA.
AC P15276;
DT 01-APR-1990 (REL. 14, CREATED)
DT 01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)
DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE)
DE TRANSCRIPTIONAL REGULATORY PROTEIN ALGP (ALGINATE REGULATORY PROTEIN
DE ALGR3).
GN ALGP OR ALGR3.
OS PSEUDOMONAS AERUGINOSA.
OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; AEROBIC RODS AND COCCI;
OC PSEUDOMONADACEAE.

RN [1]
RP SEQUENCE FROM N.A.
RM 90106714
RA KATO J., CHU L., KITANO K., DEVAULT J.D., KIMBARA K.,
RA CHAKRABARTY A.M., MISRA T.K.;
RL GENE 84:31-38(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=8882;
RM 90222135
RA KATO J., MISRA T.K., CHAKRABARTY A.M.;
RL PROC. NATL. ACAD. SCI. U.S.A. 87:2887-2891(1990).
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=8830;
RM 91008921
RA DERETIC V., KONYECSNI W.M.;
RL J. BACTERIOL. 172:5544-5554(1990).
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=PAO / PA02003;
RM 90236911
RA KONYECSNI W.M., DERETIC V.;

R C L C ^TuISlSi; TH£ ! PROMOTER^FOR^A^ CRIT ICAL ALGINATE BIOSYNTHETIC 

CC GENE, ALGD, ENCODING GDP-MANNOSE DEHYDROGENASE, IS ACTIVATED ONLY
CC UNDER CONDITIONS REMINISCENT OF THE CYSTIC FIBROSIS LUNG (I.E.,
CC UNDER HIGH OSMOLARITY), AND AT LEAST TWO REGULATORY GENES, ALGP
CC AND ALGQ, HAVE BEEN IMPLICATED IN THIS ACTIVATION PROCESS.
CC -!- DISEASE: ALGINATE IS AN EXOPOLYSACCHARIDE PRODUCED BY STRAINS
CC OF P. AERUGINOSA DURING INFECTION IN THE RESPIRATORY TRACT OF

S S C XY F -rMINL P BlNDS T ?0 DNA. IT IS UNKNOWN WHETHER BINDING IS 

CC SPECIFIC OR NON-SPECIFIC. 

CC -!- SIMILARITY: TO EUKARYOTIC HISTONES H1.

DR EMBL; M30145; PAARGRA.

DR EMBL; M57551; PAALGP. 

DR EMBL; M32077; PAALALPQ. 

DR EMBL; M35259; PAALGR3A. 











DR PIR; A35630; A35630.
DR PIR; A36128; A36128.
KW ALGINATE BIOSYNTHESIS; DNA-BINDING; REPEAT.
FT CONFLICT  28  28 G -> D (IN REF. 4).
FT CONFLICT 157 158 MA -> KR (IN REF. 3).
FT CONFLICT 157 158 NA -> KP (IN REF. 4).
FT CONFLICT 173 173 K -> KPATK (IN REF. 4).
FT CONFLICT 177 177 A -> G (IN REF. 3 AND 4).
FT CONFLICT 184 184 A -> T (IN REF. 4).
FT CONFLICT 219 220 NA -> KP (IN REF. 3 AND 4).
FT CONFLICT 242 242 A -> T (IN REF. 4).
FT CONFLICT 262 263 HV -> PA (IN REF. 4).
FT CONFLICT 268 268 A -> AKPVAKSAA (IN REF. 4).
FT CONFLICT 278 279 NA -> KP (IN REF. 3 AND 4).
FT CONFLICT 299 299 A -> T (IN REF. 4).
FT CONFLICT 308 309 NA -> KP (IN REF. 3 AND 4).
FT CONFLICT 316 316 P -> T (IN REF. 2).
SQ SEQUENCE 340 AA; 33... MW; 440087 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



10 20 30 40 50 60 70



X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV



RETISDLEEALDTLKAR8ADTRTYIVGLKRDVQESLKLAQGVGKVKEAAGKALESRKAKPATKPAAKAAAKP 
80 90 100 X 110 120 130 X 140



AVKTVAANAAAKPAAKPAAKPAAKT 



AAAKPAAKPAAKPAAKPAAKPAAKTAAAKPAAKPAAKPVAKPAANAA 



150 160 170 180 190 200 210 220



AKTAAAKPAAK 
230 



13. CHLI_EUGGR PROBABLE MAGNESIUM-CHELATASE SUBUNIT.

ID CHLI_EUGGR STANDARD; PRT; 348 AA.
AC P31205;
DT 01-JUL-1993 (REL. 26, CREATED)
DT 01-JUL-1993 (REL. 26, LAST SEQUENCE UPDATE)
DT 01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE)
DE PROBABLE MAGNESIUM-CHELATASE SUBUNIT.
GN CHLI OR CCSA.
OS EUGLENA GRACILIS.
OG CHLOROPLAST.
OC EUKARYOTA; PLANTA; PHYCOPHYTA; EUGLENOPHYTA.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Z;
RM 92299087
RA ORSAT B., MONFORT A., CHATELLARD P., STUTZ E.;

E ^cn.S^^^il'.'w'SUwLMT PIt «E»T B,OSV»THES,S (PROBABLE) 

DR EMBL; Z11874; CHEGZ.
DR EMBL; X65484; EGCCSA.
DR PIR; S21383; S21383.
KW PHOTOSYNTHESIS; CHLOROPHYLL BIOSYNTHESIS; CHLOROPLAST.
SQ SEQUENCE 348 AA; 39307 MW; 615982 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0


MNKKTNERPvFPFTSIVGSEEMKLSLILNVIDPKIGGVMIMGDRGTGKSTIVRALVDLLPPIDVIENDPYNS 
10 20 30 40 50 60 70 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

DPYDTELMSODVLEKIKKNEKVSI IQVKTPMVDLPLGGTEDRVCGTIDIEKAISEGKKAFEPGLLAQANRGI 
80 90 100 110 120 130 140 

LYVDEVNLLDDHLVDVLLDSAASGWNTVEREGVSICHPARFILVGSGNPEEGELRPQLLDRFGMHAQIKTLK 
150 160 170 180 190 200 210 

EPALRVKl VQQ 
220 



14. US-08-300-510-1 (1-27) 

PRTZ_HORVU PROTEIN Z (Z4) (MAJOR ENDOSPERM ALBUMIN)



ID PRTZ_HORVU STANDARD; PRT; 399 AA.
AC P06293;
DT 01-JAN-1988 (REL. 06, CREATED)
DT 01-NOV-1991 (REL. 20, LAST SEQUENCE UPDATE)
DT 01-MAR-1992 (REL. 21, LAST ANNOTATION UPDATE)
DE PROTEIN Z (Z4) (MAJOR ENDOSPERM ALBUMIN).
GN PAZ1.
OS HORDEUM VULGARE (BARLEY).
OC EUKARYOTA; PLANTA; EMBRYOPHYTA; ANGIOSPERMAE; MONOCOTYLEDONEAE;
OC CYPERALES; GRAMINEAE.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC STRAIN=CV. CARLSBERG II; TISSUE=GRAIN;
RM 91099324
RA BRANDT A., SVENDSEN I., HEJGAARD J.;
RL EUR. J. BIOCHEM. 194:499-505(1990).
RN [2]
RP SEQUENCE OF 220-399 FROM N.A., AND PARTIAL SEQUENCE.
RC STRAIN=CV. CARLSBERG II; TISSUE=GRAIN;
RA HEJGAARD J., RASMUSSEN S.K., BRANDT A., SVENDSEN I.;
RL FEBS LETT. 180:89-94(1985).
CC -!- FUNCTION: A MAJOR COMPONENT OF THE ENDOSPERM ALBUMIN, THIS PROTEIN
CC ACTS AS A STORAGE PROTEIN DURING GRAIN FILLING, CONTRIBUTING A
CC SUBSTANTIAL PART OF THE GRAIN'S LYSINE.
CC -!- TISSUE SPECIFICITY: IS ACCUMULATED AND STORED IN THE ENDOSPERM,
CC WHERE IT EXISTS IN A FREE AND A BOUND FORM.
CC -!- DEVELOPMENTAL STAGE: SYNTHESIZED 10-25 DAYS AFTER FERTILIZATION
CC (DEVELOPING ENDOSPERM).
CC -!- INDUCTION: ITS EXPRESSION IS REGULATED BY THE "HIGH LYSINE
CC -!- SIMILARITY: WITH SERPINS. THIS SUGGESTS THAT THIS PROTEIN ALSO HAS AN
CC INHIBITORY FUNCTION DURING FILLING OR GERMINATION.
CC -!- THERE SEEM TO BE TWO PROTEINS: Z4 (FROM CHROMOSOME 4) AND Z7
CC (FROM CHROMOSOME 7).
DR EMBL; X51726; HVPAZ1.
DR EMBL; X05902; HVPROTZ.
DR PIR; A01252; DXBHZ.
DR PIR; S13822; S13822.
DR PROSITE; PS00284; SERPIN.



FT DOMAIN 36 56 SIGNAL FOR TARGETING PROTEIN Z4 INTO
FT THE ER LUMEN (POTENTIAL).
FT ACT_SITE 357 357 REACTIVE BOND (POTENTIAL).
SQ SEQUENCE 399 AA; 43276 MW; 857901 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

ATDVRL5I AH8TRFALRLRSAISSNPERAAGNVAFSPLSLHVALSLITAGAAATRDQLVAILGDGGAGDAKE 
10 20 30 40 50 60 70 

X 10 20 X 

KRDVDLFLTGTPDEYVEQVAQYKALPV 

Ml! I till 

LNALAEQVVQFVLANESSTGGPRI AFANGIFVDASLSLKPSFEELAVCQYKAKTQSVDFQHKTLEAVGQVNS 

80 90 100 X 110 120 130 X 140 

WVEQVTTGLIKQILPPGSVDNTTKLILGNALYFKGAWDQKFDESNTKCDSFHLLDGSSIQTQFMSSTKKQYI 
150 160 170 180 190 200 210 220 

SSSDNLKVLKL 
230 



15. PRE2_STAAU PLASMID RECOMBINATION ENZYME (MOBILIZATION PROTEIN

ID PRE2_STAAU STANDARD; PRT; 420 AA.

AC P22490; 

DT 01-AUG-1991 (REL. 19, CREATED) 

DT 01-AUG-1991 (REL. 19, LAST SEQUENCE UPDATE) 

DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE) 

DE PLASMID RECOMBINATION ENZYME (MOBILIZATION PROTEIN). 

GN PRE OR MOB. 

OS STAPHYLOCOCCUS AUREUS. 

OG PLASMID PUB110. 

OC PROKARYOTA; FIRMICUTES; COCCI; MICROCOCCACEAE.

RN [1]

RP SEQUENCE FROM N.A. 

RA BASHKIROV V.I., MIL'SHINA N.V., PROZOROV A. A.; 

R C L C S ET F X^ OF THE RSA SITE AND THE PRE PROTEIN 

CC MAY NOT ONLY SERVE A FUNCTION IN PLASMID MAINTENANCE, BUT ALSO

CC MAY CONTRIBUTE TO THE DISTRIBUTION OF SMALL ANTIBIOTIC RESISTANCE

CC PLASMIDS AMONG GRAM-POSITIVE BACTERIA.

CC -!- SIMILARITY: TO OTHER PRE PROTEINS (FROM PLASMIDS PUB110, PMV158,
CC PE194, PT181, PTB913), IN THEIR N-TERMINAL ONLY.
CC -!- PRE PROTEINS CONTAIN CONSERVED POSITIVELY CHARGED AMINO ACIDS
CC PROBABLY INVOLVED IN THE BINDING OF THE PRE PROTEIN TO THE RSA
CC SITE.
DR EMBL; M37273; PPKANRCG.
KW PLASMID; DNA-BINDING.
FT BINDING 44 44 DNA (POTENTIAL).
FT BINDING 114 114 DNA (POTENTIAL).



SQ SEQUENCE 420 AA; 49660 MW; 900437 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.58
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

MFGLGKEIMKTEKKPTKNVVISERDYKNLVTAARDNDRLKQHVRNLMSTDMAREYKKLSKEHGQVKEKYSGL 
250 260 270 280 290 300 310 



X 10 20 X 



KRDVDLFLTGTPDEYVEQVAQYKALPV



VERFHENVNDVNELLEENKSLKSK(SDLKRDVSLIVE5TKEFLKERTDGLKAFKMVFKGPVDKVK0KTASF9 

320 330 340 X 350 360 

EKHDLEPKKNEFELTHNREVKKERSRDQGMSL 
390 400 410 420



IntelliGenetics



FastDB - Fast Pairwise Comparison of Sequences
Release 5.4 

Results file 2-pir.res made by on Fri 24 Mar 95 7:48:03-PST.



Query sequence being compared: US-08-300-510-2 (1-27)
Number of sequences searched: 75511
Number of scores above cutoff: 4500

Results of the initial comparison of US-08-300-510-2 (1-27) with:

Data bank : PIR 43, all entries

[Histogram of the initial-score distribution: NUMBER OF SEQUENCES (logarithmic axis, 5 to 100000) plotted against SCORE (0 to 26) and STDEV (-1 to 9).]


PARAMETERS

Similarity matrix         Unitary    K-tuple             2
Mismatch penalty          1          Joining penalty     20
Gap penalty               1.00       Window size         27
Gap size penalty          0.05
Cutoff score              0
Randomization group       0

Initial scores to save    40         Alignments to save  15
Optimized scores to save  0          Display context     100

SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           3                 1.42

Times:     CPU            Total Elapsed
           00:00:59.05    00:01:00.00

Number of residues:             22468834
Number of sequences searched:   75511
Number of scores above cutoff:  4500

Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.




The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found.
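
The Sig. values in this report appear to be z-scores: the initial score minus the mean of all initial scores, divided by their standard deviation. That reading is an inference from the printed figures (mean 3, standard deviation 1.42) and is not taken from FastDB documentation. A minimal sketch under that assumption follows; the function names are hypothetical.

```python
import math


def significance(score: float, mean: float, stdev: float) -> float:
    """Number of standard deviations by which `score` exceeds `mean`."""
    if stdev <= 0:
        raise ValueError("stdev must be positive")
    return (score - mean) / stdev


def deviation_band(sig: float) -> int:
    """Whole-number band, as in headings like '16 standard deviations above mean'."""
    return math.floor(sig)


# Printed search statistics (both values are rounded in the report).
MEAN, STDEV = 3.0, 1.42
for initial_score in (26, 13, 12, 11, 10, 9):
    sig = significance(initial_score, MEAN, STDEV)
    print(initial_score, round(sig, 2), deviation_band(sig))
```

Because the report prints the mean and standard deviation in rounded form, values computed this way can differ from the listed Sig. column in the second decimal place.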



The list of best scores is 



    Sequence Name  Description                    Length  Init.  Opt.   Sig.   Frame
                                                          Score  Score

               **** 16 standard deviations above mean ****
 1. JC1126   major allergen chain 1 precur      88    26    26   16.17    0
 2. JC1136   major allergen chain 1 precur      92    26    26   16.17    0
               **** 7 standard deviations above mean ****
 3. GNNY2F   genome polyprotein - foot-and    2333    13    14    7.03    0
               **** 6 standard deviations above mean ****
 4. A53283   major cat allergen Fel d I al      40    12    12    6.33    0
 5. S37077   genome polyprotein - foot-and    2336    12    13    6.33    0
               **** 5 standard deviations above mean ****
 6. S40064   3-deoxy-D-manno-2-octulosonic     411    11    11    5.62    0
 7. S28562   3-deoxy-D-manno-2-octulosonic     411    11    11    5.62    0







 8. JN0431   RNA-directed RNA polymerase (     470    11    12    5.62    0
 9. GNNYF    genome polyprotein - foot-and    2332    11    12    5.62    0
               **** 4 standard deviations above mean ****
10. S02068   RNA-directed RNA polymerase (     470    10    11    4.92    0
11. S10340   DNA-directed RNA polymerase (     982    10    10    4.92    0
12. S00964   hypothetical protein 6 - yeas     982    10    10    4.92    0
13. GNNY4F   genome polyprotein - foot-and    2332    10    11    4.92    0
14. C30305   submandibular gland protein (      91     9    10    4.22    0
15. A35072   nonhistone chromosomal protei      93     9     9    4.22    0
16. B35072   nonhistone chromosomal protei      99     9     9    4.22    0
17. HRTHfl   myohemerythrin - sipunculid (     118     9    11    4.22    0
18. S45108   hypothetical protein 2 - Erwi     151     9     9    4.22    0
19. S29037   Na+-transporting ATP synthase     163     9    11    4.22    0
20. S12620   Na+-transporting ATP synthase     163     9    11    4.22    0
21. S24369   Na+-transporting ATP synthase     163     9    11    4.22    0
22. CFXCA    C-phycoerythrin alpha chain -     164     9    10    4.22    0
23. S23323   Na+-transporting ATP synthase     168     9    11    4.22    0
24. S15274   cutR protein - Streptomyces l     217     9     9    4.22    0
25. S44182   Phl p I allergen - Common tim     263     9     9    4.22    0
26. S24232   mpl protein - Listeria monocy     271     9    10    4.22    0
27. S46030   hypothetical membrane protein     347     9     9    4.22    0
28. KIUTGC   phosphoglycerate kinase (EC 2     421     9    11    4.22    0
29. S19722   dihydrolipoamide acetyltransf                        4.22    0
30. S42206   enolase (EC 4.2.1.11) - Plasm     446                4.22
31. DEBY4    alcohol dehydrogenase (EC 1.1
32. S42306   gene 4B protein - phage T7
33. A60280   bacillolysin homolog (EC 3.4.
34. A24031   genome polyprotein - foot-and     529     9    10    4.22    0
35. VHIVC8   nucleoprotein - influenza C v     565     9     9    4.22    0
36. S42304   gene 4A protein - phage T7        566     9    10    4.22    0
37. S07508   DNA primase - phage T3            566     9    10    4.22    0
38. YDBPA7   DNA primase chain A - phage T     566     9    10    4.22    0
39. C25035   colicin Ia - Escherichia coli     626     9     9    4.22    0
40. D25035   colicin Ib - Escherichia coli     626     9     9    4.22    0




1. US-08-300-510-2 (1-27)

JC1126 major allergen chain 1 precursor B - cat

ENTRY JC1126 #type complete
TITLE major allergen chain 1 precursor B - cat
ORGANISM #formal_name Felis silvestris catus #common_name domestic cat
DATE 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
31-Dec-1993
ACCESSIONS JC1126
REFERENCE JC1126
#authors Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
Morgenstern, J.P.; Rogers, B.L.
#journal Gene (1992) 113:263-268
#title Expression and genomic structure of the genes encoding FdI,
the major allergen from the domestic cat.
#accession JC1126
##molecule_type DNA
##residues 1-88 ##label GRI
GENETICS
#gene Ch1
#introns 17/1; 79/3
FEATURE
1-18 #domain signal sequence #status predicted #label SIG\
19-88 #product major allergen chain 1 #status predicted #label MAT
SUMMARY #length 88 #molecular-weight 9586 #checksum 4095
SEQUENCE

Initial Score = 26 Optimized Score = 26 Significance = 16.17
Residue Identity = 96% Matches = 26 Mismatches = 1
Gaps = 0 Conservative Substitutions = 0



X 10 20 

KALPVVLENARILKNCVDAKMTEEDK 

IIIIMM iMIIMIIMUMM 
HLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVE8VA8YNALPVVLEMARILKNCVDAKMTEEDK 

l0 20 30 40 X 50 60 70 

X 
E 

i 

ENALSVLDK I YTSPLC 
X 80 



2. US-08-300-510-2 (1-27)

JC1136 major allergen chain 1 precursor A - cat

ENTRY JC1136 #type complete
TITLE major allergen chain 1 precursor A - cat
ORGANISM #formal_name Felis silvestris catus #common_name domestic cat
DATE 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change
31-Dec-1993
ACCESSIONS JC1136
REFERENCE JC1126
#authors Griffith, I.J.; Craig, S.; Pollock, J.; Yu, X.B.;
Morgenstern, J.P.; Rogers, B.L.
#journal Gene (1992) 113:263-268
#title Expression and genomic structure of the genes encoding FdI,
the major allergen from the domestic cat.
#accession JC1136
##molecule_type DNA
##residues 1-92 ##label GRI
GENETICS
#gene Ch1
#introns 21/1; 83/3
FEATURE
1-22 #domain signal sequence #status predicted #label SIG\
23-92 #product major allergen chain 1 #status predicted #label MAT
SUMMARY #length 92 #molecular-weight 10072 #checksum 4988
SEQUENCE

Initial Score = 26 Optimized Score = 26 Significance = 16.17
Residue Identity = 96% Matches = 26 Mismatches = 1
Gaps = 0 Conservative Substitutions = 0



X 10 20 

KALPVVLENARILKNCVDAKMT 

Ml II Mill II I Ml IN M 
MKGACVLVLLWAALLLISGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYNALPVVLENARILKNCVDAKMT 

10 20 30 40 50 60 70



X 

EEDKE 



EEDKENALSVLDK I YTSPLC 
X 80 90 



3. US-08-300-510-2 (1-27) 

GNNY2F genome polyprotein - foot-and-mouth disease virus 

ENTRY GNNY2F #type complete

TITLE genome polyprotein - foot-and-mouth disease virus A (strain

A10-61)



ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
•authors 
•journal 
•title 



vri i LUdt , u.ou-i n vr** coau protein vro. coat 
" o 4 «re protein p32> genome-linked protein VPgl", 

S«om-1 inked protein VPg2 ? genome-linked protein VPg3, 
nonstructural protein p20a; nonstructural protein P 20b, 
RNA-dir«ted RNA polynerase (EC 2.7.7.4BJ 
•fS^-lln;— Aphthovirus A •common.name foot-and-mouth 

17?i""i;aJ lr tI.qi.nc..revi.ion 2B-Aug-1985 •text.change 

31-Dec-1993 
A93508! A91491J S30753 

A93508 „, , _ e 

Carroll, A.R.! Rowlands, D.J. J Clarke, B.E. 

primary translation product of foot and mouth disease 
virus . 

•cross-references MUID '.84169547 
•accession A93508 

itnolecule.type genomic RNA 

••residues 1-2333 fllabel CAR 

••cross-references GB'.X00429 

"tnors foofhroyd, U.C.. Harris, T.J.R.. Ro„l.nd.. D.J.. Uue, P.A. 

; j «tU 1 TSr««T«liir«^i«« of cDNA coding for the structural 

proteins of foot-and-mouth disease virus, 
•cross-references MUIDJ82211814 
•accession A91491 

r«":l!^:- tUPe ?EmL™3.7-»1.-L- .433-104. .Hat*. BOO 

^^cross-references GB5V01130 

synthesis at tuo separate AUGs. 

•accession S30753 

••molecule type genomic RNA 
••residues" 1-32 8*label SAN 

• •cross-references EMBUM31575 MtBa , a , a vir >u« aenome poluprotein 



KEYWORDS
FEATURE
1-204 #product nonstructural protein p20a #label NPA\
205-286 #product coat protein VP4 #label VP4\
287-504 #product coat protein VP2 #label VP2\
505-725 #product coat protein VP3 #label VP3\
726-937 #product coat protein VP1 #label VP1\
938-1578 #product core protein p52
1579-1601 #product genome-linked protein VPg1 #label GL1\
1602-1625 #product genome-linked protein VPg2 #label GL2\
1626-1649 #product genome-linked protein VPg3 #label GL3\
1650-1863 #product nonstructural protein p20b #label NPB\
1864-2333 #product RNA-directed RNA polymerase #label RRP
SUMMARY #length 2333 #molecular-weight 259646 #checksum 7155
SEQUENCE

Initial Score = 13 Optimized Score = 14 Significance = 7.03
Residue Identity = 41% Matches = 15 Mismatches = 12
Gaps = 9 Conservative Substitutions = 0



l8 „f" vK "iro GVCCGs r8fo DOAOTF !ero HSAGC iMr sc " 8R f8=r BKAHv ra6r" ESL "55o o 



10 



20 



,V m i mi mm 

V EER VH,» R «TKU P TV»V 0 , FN P EF C ! ^ S »K B P- f =VV LB; ,, Fo SKHK S D t K„T EEt KALPR«CAA 0 



1880 



1890 



1900 



1910 



YASRLHSVLGTANAPLSIYEAIKG 



VDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEVEAALKLMEK 



1950 



1960 



1970 



1980 



1990 



2000 



REYKFACQTFLKDEIRPMEK 
2020 2030 



4. US-08-300-510-2 (1-27)

A53283 major cat allergen Fel d I alpha chain - cat (frag



ENTRY 
TITLE 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
Sauthors 

Sjournal 
Hitle 

ftaccession 
tt&status 
##nolecule 
#*residues 

SUMMARY 

SEQUENCE 

Initial Score 
Residue Identity 
Gaps 



A53283 tttype Oagnent _ 

naJ or cat allergen e d P c « M 9 doneslic cat 

#fornal_nane Felis silvesuis ca change 
12-May-1994 8sequence_revision 12-May-1994 #text_cnang 

12-May-1994 
A53283 

A53283 I • Mitt i, G.J Polo. F.J Lonbardero » 

Duffort. O.A.: Carreira. J.' Nitti- b.. 

allergen Felis dotnesticus I. 

A53283 

preliminary 
type protein 

1-40 SSlabel DUF 
#length 40 fcchecksum 3032 



12 Optimized Score = 12 Significance 
toll Hatches = « Hisnatches 

0 Conservative Substitutions 

X 10 X 20 

KALPVVLENARILKNCVDAKMTEEDKE 



I 



6.33 

0 
0 



EICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENAR 
10 20 30 40 



5 . »^»-4 B i;-. 87 P i lBP „ ulll - ,0*-**-™* -1. «™ 



ENTRY 
TITLE 

CONTAINS 



ORGANISE 
DATE 

ACCESSIONS 
REFERENCE 
Sauthors 



pr „? ein V P „ coro protein Pl« ™£P " p 
nonstructural protein p20 8 i P™'""" i" 3 ''- ' 

disease virus A . .p,, »ta X t change 

31-Dec-1993 Ssequence.revision 31-Dec-1993 Stext.cnang 

31-Dec-1993 
S37077 

losTovtsev, S.V.J Qnischenko, A.M.J Petrov. M.A.I 
KaL.hnikova. T.I.J Manaeva. N.V.J Drygin, V.Y.. 
Perevozchikova, N.A.J Vasilenko. S.K. 



ftaccess ion 



"liuol-it Ueb 16 tnu dies 
S37077 

ftftmolecule type genomic RNA 
ftftresidues 1-2336 ftftlabel SOS 

ftftcross-ref erences EFIBL J X74812 . 
CLASSIFICATION ftsuperfamily foot-and-mouth disease virus genome polyprotein 
KEYWORDS coat protein! core protein; genome-linked protein; 

KEYWORDS nonstructural protein; nucleotidyltransferase; polyprotein 



FEATURE
1-217 #product nonstructural protein p20a #status predicted #label NPA\
218-286 #product coat protein VP4 #status predicted #label VP4\
287-504 #product coat protein VP2 #status predicted #label VP2\
505-724 #product coat protein VP3 #status predicted #label VP3\
725-938 #product coat protein VP1 #status predicted #label VP1\
939-954 #product core protein X #status predicted #label CPX\
955-1108 #product core protein p14 #status predicted #label C14\
1109-1426 #product core protein p41 #status predicted #label C41\
1427-1579 #product core protein p19 #status predicted #label C19\
1580-1602 #product genome-linked protein VPg1 #status predicted #label VG1\
1603-1626 #product genome-linked protein VPg2 #status predicted #label VG2\
1627-1650 #product genome-linked protein VPg3 #status predicted #label VG3\
1651-1863 #product proteinase #status predicted #label PTS\
1864-2333 #product RNA-directed RNA polymerase #status predicted #label RRP
SUMMARY #length 2336 #molecular-weight 259983 #checksum 4399
SEQUENCE

Initial Score = 12 Optimized Score = 13 Significance = 6.33
Residue Identity = 38% Matches = 14 Mismatches = 13
Gaps = 9 Conservative Substitutions = 0

GLFAYKAATKAGYCGGAVLAKDGADTF IVGTHSAGGNGVGYCSCVSRSMLLKMKAHIDPEPHHEGLIVDTRD 
1800 1810 1820 1830 1840 1850 1860 1870

x 10 20 X 

KAL PVVLENARILKNCVDAKMTEEDKE 

|| III I I 1 t f 1 I I t 

VEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFSKHKGDTKMTEEDKALFRRCAAD 

1900 1910 1920 1930 X 1940 



1880 



1890 



YASRLHNVLGTANAPLSI YEAIKGVDGLDAMEPDTAPGLPWALGGKRRGTL I DFENGTVGPEVASALELMEK 

1970 1980 1990 .2000 2010 



1950 



1960 



RQYKFTCQTFLKDEVRPMEK 
2020 2030 



6. US-08-300-510-2 (1-27) 

S40064 3-deoxy-D-manno-2-octulosonic acid (Kdo) transfera 



ENTRY 
TITLE 

ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
ftauthors 
ftjournal 
fttitle 



19-May-1994; fttext.change 



S40064 fttype complete 

3-deoxy-D-manno-2-octulosonic acid (Kdo) transferase - 

Chlamydia psittaci 
ftf ormal_nap.e Chlamydia psittaci 
19-May-1994; ftsequence_rev is ion 

19-May-1994 
S40064 
S40064 

Mamat, U.; Baumann, M.; Schmidt, G.; Brade, H.
Mol. Microbiol. (1993) 10:935-941

The genus-specific lipopolysaccharide epitope of Chlamydia 
assembled in C. psittaci and C. trachomatis by 



giycosyitranst erases ot ion noino.ogy. 



^accession S40064 

••status preliminary 
••residues 1-411 Wlabel MAM 

##cross-references EMBL:X69476

SUMMARY #length 411 #molecular-weight 46618 #checksum 5839

SEQUENCE 

«••""• I '"J Substation, ■ ° 

Gaps 

140 150 



ftVIINGKLSANSCKRFTILKRFGRNYFSPVDGFLLflDESHKARFLSLGVDKEKIflVTGNIKTYTETLSENNS 



X 10 20 X 

KALPVVLENARILKNCVDAKMTEEBKE 

... i it i i t 



AQDTELLVLGSVHPKDVEVWLPVVRELRRNLKVLUVPRHIE^ 



RDYWREKLQL 
210 220 230 



GAGCCLDKTNI 
360 



,. ^°»- 5 '»- 3 ? d ^Ji;.„ a „ no . 2 . octulo5 on l c acid <Kd ol transFera 

X ^U.1»-.t!r*-«^««t' a.id <Kd.> tr„.f - 

Chlamydia psittati 

18-Jun-1993 
ACCESSIONS S28562 

"'ESSr. Bau»a„«. K.. Scn-idt. ... Br.-.. H. 

Sautnors nan Library* November 1992 

ttdescripLi invQlved in the expression of the genus-specif ic 

lipopolysaccharide epitope, 
^accession S28562 
##nolecule type DNA 
ffresidues 1-411 Mlabel MAM 

##cross-ref erences EMBL?X69476 
GENETICS 

SUMMARY #length 411 #molecular-weight 46618 #checksum 5839

SEQUENCE 

— • "« ■ x rr^ store = » ' : 5 '" 
j"'"- ide " ui,f : ,0 sub S tit U tio„ 5 • ° 

uaps 

AVnNGKLSAHSCKRFTlLKRFGRNYFSPVDGFLLSDESHKARFLQLGVDKEKieVTGNIKTY^^ 

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

RDYUREKL8tA6DTELLVLGSVHPKDVEVWLPVVRELRRNLKVtHVPRHIER^ 



tH i t HisnuH 1 1 v 



290 300 310 320 330 340 350 



GAGCCLDKTNI 
360 



8. US-08-300-510-2 (1-27) 

JN0431 RNA-directed RNA polymerase (EC 2.7.7.48) - foot-a



ENTRY JN0431 #type complete
TITLE RNA-directed RNA polymerase (EC 2.7.7.48) - foot-and-mouth
disease virus A (strain A22)
ORGANISM #formal_name Aphthovirus A #common_name foot-and-mouth
disease virus A
DATE 05-Mar-1993 #sequence_revision 05-Mar-1993 #text_change
30-Sep-1993
ACCESSIONS JN0431
REFERENCE JN0431
#authors Kuzmin, I.V.; Rybakov, S.S.; Ivanyushchenkov, V.N.; Burdov,
A.N.
#journal Bioorg. Khim. (1989) 15:419-422
#title Nucleotide sequence of the FMDV A22 RNA polymerase gene.
#cross-references MUID:89302183
#accession JN0431
##molecule_type mRNA
##residues 1-470 ##label KUZ
##note this paper is in Russian, with an English abstract
CLASSIFICATION #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS nucleotidyltransferase
SUMMARY #length 470 #molecular-weight 52657 #checksum 1182
SEQUENCE



Initial Score = 11 Optimized Score = 12 Significance = 5.62
Residue Identity = 36% Matches = 13 Mismatches = 14
Gaps = 9 Conservative Substitutions = 0



x 10 20 X 

KAL PVVLENARILKNCVDAKMTEEDKE 

li Ml I 1 IN ID 

GLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALFNKDPRLNEGVVLDEVIFSKHKGDTKMTAEDKA 

10 20 30 X 40 50 60 70 X 



LFRACAADYASRLHNVLGTANAPLSI YEAIKGIDGLDAMEPDTAPGLPHALQGQRRGALIDFENGTVGPEVA 

80 90 100 110 120 130 140



SALELMEKRQYKFTCGTFLKDEVRPMEK 
150 160 170 



9. US-08-300-510-2 (1-27) 

GNNYF genome polyprotein - foot-and-mouth disease virus

ENTRY GNNYF #type complete
TITLE genome polyprotein - foot-and-mouth disease virus O (strains
O1K and O1BFS)
CONTAINS coat protein VP1; coat protein VP2; coat protein VP3; coat
protein VP4; core protein p12; core protein p14; core
protein p20b; core protein p34; core protein p56; core
protein VPg; nonstructural protein p20a
ORGANISM #formal_name Aphthovirus O #common_name foot-and-mouth
disease virus O
#note host Artiodactyla (cloven-footed mammals)
DATE 01-Sep-1981 #sequence_revision 27-Nov-1985 #text_change
08-Apr-1994
ACCESSIONS A03907; A37503



#authors Forss, S.; Strebel, K.; Beck, E.; Schaller,
#journal Nucleic Acids Res. (1984) 12:6587-6601
#title Nucleotide sequence and genome organization of foot-and-mouth
disease virus.
#cross-references MUID:84297249
#contents strain O1K
#accession A03907
##molecule_type mRNA
##residues 1-2332 ##label FOR
REFERENCE
#authors Makoff, A.J.; Paynter, C.A.; Rowlands, D.J.; Boothroyd, J.C.
#journal Nucleic Acids Res. (1982) 10:8285-8295
#title Comparison of the amino acid sequence of the major immunogen
from three serotypes of foot and mouth disease virus.
#cross-references MUID:83143292
#contents strain O1BFS
#accession A37503

"^H~ e SS!?}l.^.7..-M7...-.«).- M ..-.-.«a-»51 «L».l ««K 

COMMENT The coat protein VP1 contains the main antigenic determinants of 
the virion; thereforer changes in its sequence must be 
responsible for the high antigenic variability of the virus. 

COMMENT Coat proteins VP2 and VP3 are related to the poliovirus coat 

proteins VP2 and VP3. . 

CLASSIFICATION #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS coat protein; core protein; nonstructural protein;
polyprotein
FEATURE
1-217 #product nonstructural protein p20a #label NPA\
218-286 #product coat protein VP4 #label VP4\
287-504 #product coat protein VP2 #label VP2\
505-724 #product coat protein VP3 #label VP3\
725-937 #product coat protein VP1 #label VP1\
938-1107 #product core protein p12 #label C12\
1108-1425 #product core protein p34 #label P34\
1426-1578 #product core protein p14 #label C14\
1579-1649 #product genome-linked protein VPg #label VPG\
1650-1862 #product nonstructural protein p20b #label P20\
1863-2332 #product RNA-directed RNA polymerase #label P56
SUMMARY #length 2332 #molecular-weight 258925 #checksum 4170

SEQUENCE 

Initial Score = 11 Optimized Score = 12 Significance = 5.62
Residue Identity = 36% Matches = 13 Mismatches = 14

Gaps = 9 Conservative Substitutions = 0

GLFAYRAATKAGYCGGAVLAKDGADTFI VGTHSAGGNGVGYCSCVSRSMLLKHKAHIDPEPHHEGLI VDTRD 
1800 1810 1820 1830 1840 1850 1860 1870

X 10 20 X 

KAL- PVVLENARILKNCVDAKMTEEDKE 



VEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFSKHKGDTKMSEEDKALFRRCAAD 
1880 1890 1900 1910 1920 1930 X 1940 

YASRLHSVLGTANAPLSI YEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEVEAALKLMEK 
1950 1960 1970 1980 1990 2000 2010 

REYKFVCQTFLKDEIRPLEK 
2020 2030 
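
The Residue Identity percentages in these alignments are consistent with dividing the match count by the total number of aligned columns (matches plus mismatches plus gaps) and truncating to a whole percent. This relationship is inferred from the figures printed in this report, not from FastDB documentation, and the function name below is hypothetical. A minimal sketch:

```python
def residue_identity_percent(matches: int, mismatches: int, gaps: int) -> int:
    """Matches as a truncated whole-number percentage of aligned columns (inferred rule)."""
    aligned = matches + mismatches + gaps
    if aligned == 0:
        raise ValueError("empty alignment")
    return (100 * matches) // aligned


# Match, mismatch and gap counts as printed for several results in this report.
print(residue_identity_percent(13, 14, 9))
print(residue_identity_percent(26, 1, 0))
print(residue_identity_percent(15, 12, 9))
```

Truncation rather than rounding is assumed because 15 matches over 36 columns (41.7 percent) is reported as 41.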



10. US-08-300-510-2 (1-27) 

S02068 RNA-directed RNA polymerase (EC 2.7.7.48) - foot-a 



TITLE RNA-directed RNA polymerase (EC 2.7.7.48) - foot-and-mouth
disease virus A
ALTERNATE_NAMES RNA replicase
ORGANISM #formal_name Aphthovirus A #common_name foot-and-mouth
disease virus A
DATE 01-Dec-1989 #sequence_revision 01-Dec-1989 #text_change
30-Sep-1993
ACCESSIONS S02068
REFERENCE S02068
#authors Villaverde, A.; Martinez-Salas, E.; Domingo, E.
#journal J. Mol. Biol. (1988) 204:771-776
#title 3D gene of foot-and-mouth disease virus. Conservation by
convergence of average sequences.
#cross-references MUID:89141768
#accession S02068
##molecule_type mRNA
##residues 1-470 ##label VIL



##note 48-Gly, 68-Ala, 158-Val, 274-IU, 306-Ile, 374-Leu, and
444-Glu were also found
##note sequence not compared to nucleotide translation
GENETICS
#gene 3D
CLASSIFICATION #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS nucleotidyltransferase
SUMMARY #length 470 #molecular-weight 52910 #checksum 502
SEQUENCE

Initial Score = 10 Optimized Score = 11 Significance = 4.92
Residue Identity = 33% Matches = 12 Mismatches = 15
Gaps = 9 Conservative Substitutions = 0



X 10 20 X 

KAL- PVVLENARILKNCVDAKMTEEDKE

ii ill i ii nil 

GLIVDTRDVEERVHVHRKTKLAPTVAHGVFMPEFGPAALSHKDPRLMEGVVLDEVIFSRHKCDTKHSEEDKA 
10 20 30 X 40 50 60 70 X 

LFRRCAADYASRLHSVLGTANAPLSI YEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEAE 
80 90 100 110 120 130 140 



AALKLMEKREYKFACGTFIKDEIRPMEK 
150 160 170 



11. US-08-300-510-2 (1-27)

S10340 DNA-directed RNA polymerase (EC 2.7.7.6) - yeast

ENTRY S10340 #type complete
TITLE DNA-directed RNA polymerase (EC 2.7.7.6) - yeast
(Kluyveromyces marxianus var. lactis)
ORGANISM #formal_name Kluyveromyces marxianus var. lactis, Candida
sphaerica
DATE 21-Nov-1993 #sequence_revision 21-Nov-1993 #text_change
21-Nov-1993
ACCESSIONS S10340
REFERENCE S10336
#authors Wilson, D.W.; Meacock, P.A.
#journal Nucleic Acids Res. (1988) 16:8097-8112
#title Extranuclear gene expression in yeast: evidence for a
plasmid-encoded RNA polymerase of unique structure.
#cross-references MUID:88335549
#accession S10340
##status preliminary
##residues 1-982 ##label WIL
##cross-references EMBL:X07946



^BT" stengtn vs* »H0leuma r-ueignt iuv&u scnecusun o<*i* 

SEQUENCE 

Initial Score = 10 Optimized Score = 10 Significance = 4.92
Residue Identity = 37% Matches = 10 Mismatches = 17
Gaps = 0 Conservative Substitutions = 0

DILIGLGAWNTIKEIWSIDRSKIKIDSKTGRINWIRYDKEMEIGQYFKICLSYMRSLGRDILIKNDKYSIVE 
640 650 660 670 680 690 700

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

|| I I III 1 II 

FDNSYLPKTDTMKFGDLDVLDLRI YKGIVMLPLCLRSTYLNKLYVDRKYSEAEKEVTKLLKSKNGAYHTLVE 

710 720 730 X 740 750 760 X 770 

GHRVDRCIRSVIVPDPTLDIDTIKIPFGANIGCEYGLLNRQPSLNVDSIKLVKLKQGSNKTIAINPLLCQSF
780 790 800 810 820 830 840 850

NADFDGDEMNI 
860 

12. US-08-300-510-2 (1-27)

S00964 hypothetical protein 6 - yeast (Kluyveromyces marx

ENTRY S00964 #type complete
TITLE hypothetical protein 6 - yeast (Kluyveromyces marxianus var.
lactis) plasmid pGKl2
ORGANISM #formal_name Kluyveromyces marxianus var. lactis, Candida
sphaerica
DATE 30-Sep-1989 #sequence_revision 30-Sep-1989 #text_change
18-Jun-1993
ACCESSIONS S00964
REFERENCE S00959
#authors Tommasino, M.; Ricci, S.; Galeotti, C.L.
#journal Nucleic Acids Res. (1988) 16:5863-5878
#title Genome organization of the killer plasmid pGKl2 from
Kluyveromyces lactis.
#cross-references MUID:88289339
#accession S00964
##molecule_type DNA
##residues 1-982 ##label TOM
##cross-references EMBL:X07776
GENETICS
#genome plasmid
SUMMARY #length 982 #molecular-weight 113960 #checksum 5276

SEQUENCE

Initial Score = 10 Optimized Score = 10 Significance = 4.92
Residue Identity = 37% Matches = 10 Mismatches = 17
Gaps = 0 Conservative Substitutions = 0

DILIGLGAWNTIKEIWSIDRSKIKIDSKTGRINWIRYDKENEIGQYFKICLSYMRSLGRDILIKNDKYSIVE 
640 650 660 670 680 690 700 

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

|| I | Ml I II 

FDNSYLPKTDTMKFGDLDVLDLRIYKGIVMLPLCLRSTYLNKLYVDRKYSEAEKEVTKLLKSKNGAYHTLVE 

710 720 730 X 740 750 760 X 770 

GHRVDRCIRSVIVPDPTLDIDTIKIPFGANIGCEYGLLNRQPSLNVDSIKLVKLKQGSNKTIAINPLLCQSF
780 790 800 810 820 830 840 850 



NADFDGDEMNI 



860



13. US-08-300-510-2 (1-27)

GNNY4F genome polyprotein - foot-and-mouth disease virus

ENTRY GNNY4F #type complete
TITLE genome polyprotein - foot-and-mouth disease virus A (strain
A12)
CONTAINS coat protein VP1; coat protein VP2; coat protein VP3; coat
protein VP4; core protein p14; core protein p19; core
protein p41; core protein X; genome-linked protein VPg1;
genome-linked protein VPg2; genome-linked protein VPg3;
nonstructural protein p20a; proteinase (EC 3.4.-.-);
RNA-directed RNA polymerase (EC 2.7.7.48)
ORGANISM #formal_name Aphthovirus A #common_name foot-and-mouth
disease virus A
DATE 30-Sep-1987 #sequence_revision 30-Sep-1987 #text_change
08-Apr-1994
ACCESSIONS A25794
REFERENCE A25794
#authors Robertson, B.H.; Grubman, M.J.; Weddell, G.N.; Moore, D.M.;
Welsh, J.D.; Fischer, T.; Dowbenko, D.J.; Yansura, D.G.;
Small, B.; Kleid, D.G.
#journal J. Virol. (1985) 54:651-660
#title Nucleotide and amino acid sequence coding for polypeptides of
foot-and-mouth disease virus type A12.
#cross-references MUID:85211015
#accession A25794
##molecule_type genomic RNA
##residues 1-2332 ##label ROB
CLASSIFICATION #superfamily foot-and-mouth disease virus genome polyprotein
KEYWORDS coat protein; core protein; genome-linked protein; hydrolase;
nonstructural protein; nucleotidyltransferase; polyprotein;
proteinase
FEATURE
   1-216       #product nonstructural protein p20a #label NPA\
   217-285     #product coat protein VP4 #label VP4\
   286-503     #product coat protein VP2 #label VP2\
   504-723     #product coat protein VP3 #label VP3\
   724-937     #product coat protein VP1 #label VP1\
   938-953     #product core protein X #label CPX\
   954-1107    #product core protein p14 #label C14\
   1108-1425   #product core protein p41 #label C41\
   1426-1578   #product core protein p19 #label C19\
   1579-1601   #product genome-linked protein VPg1 #label VG1\
   1602-1625   #product genome-linked protein VPg2 #label VG2\
   1626-1649   #product genome-linked protein VPg3 #label VG3\
   1650-1862   #product proteinase #label PTS\
   1863-2332   #product RNA-directed RNA polymerase #label RRP
SUMMARY #length 2332 #molecular-weight 259408 #checksum 6669
SEQUENCE

Initial Score = 10 Optimized Score = 11 Significance = 4.92
Residue Identity = 33% Matches = 12 Mismatches = 15
Gaps = 9 Conservative Substitutions = 0



SLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVCYCSCVSKSNLLRMKAHVDPEPaHEGLIVDTRD 
1800 1810 1820 1830 1840 1850 1860 1870

X 10 20 X

KAL- PVVLENARILKNCVDAKMTEEDKE



VEERVHVHRKTKLAPTVAHGVFMPEFGPAALSNKDPRLMEGVVLDEVIFSKHKGDTKHSAEDKALFRACAAD 
1880 1890 1900 1910 1920 1930 X 1940 



1950 1960 1970 1980 1990 2000 2010

REYKFVCQTFLKDEIRPMEK 
2020 2030 



c3"o5oS 300 " 5l °;ub»ln"L. a r ,. W d pc„t.f. ,.p.l I. P"C U r,or 



C30305 #type complete

submandibular gland protein (spot 1) precursor - rat
#formal_name Rattus norvegicus #common_name Norway rat
22-Nov-1989 #sequence_revision 22-Nov-1989 #text_change

17-Feb-1994
C30305

Dickinson, D.P.; Mirels, L.; Tabak, L.A.; Gross, K.H.
Mol. Biol. Evol. (1989) 6:80-102

Rapid evolution of variants in a rodent multigene family
encoding salivary proteins.
#cross-references MUID:89158788
#accession C30305

##status preliminary
##molecule_type mRNA

##residues 1-91 ##label DIC

CLASSIFICATION Ksuperfamily sub. andi bu 1 a, T a ^ a ^ 2 ' 6 * ^^^^77 78 
SUMMARY ftlength 91 Kmolecular-ueight 9227 ftcnecKsum 

SEQUENCE 



ENTRY 
TITLE 
ORGANISM 
DATE 

ACCESSIONS 
REFERENCE 
Sauthors 
Sjournal 
Hitle 



Initial Score = 9 Optimized Score = 10 Significance = 4.22
Residue Identity = 37% Matches = 10 Mismatches = 17
Gaps = 0 Conservative Substitutions = 0



X 10 20 X

KALPVVLENARILKNCVDAKMTEEDKE

MKFLALLVLLGVSTILVSCQDIGTDTADTSDTADGTTDSGTQADATDGQQDAESSDGTSDAVDGDAPAESDQ
X 10 20 30 40 50 60 70

EDSALLALVNTLKEKFTLG
80 90



15. US-08-300-510-2 (1-27)

A35072 nonhistone chromosomal protein NHP6A - yeast (Sacc

ENTRY A35072 #type complete
TITLE nonhistone chromosomal protein NHP6A - yeast (Saccharomyces
cerevisiae)
ORGANISM #formal_name Saccharomyces cerevisiae
DATE 22-Jan-1993 #sequence_revision 22-Jan-1993 #text_change
06-May-1994
ACCESSIONS A35072; S31260; C44031
REFERENCE A35072
#authors Kolodrubetz, D.; Burgum, A.
#journal J. Biol. Chem. (1990) 265:3234-3239
#title Duplicated NHP6 genes of Saccharomyces cerevisiae encode
proteins homologous to bovine high mobility group protein
1.
#cross-references MUID:90153974
#accession A35072
##molecule_type DNA
##residues 1-93 ##label KOL
##cross-references EMBL:X15317
REFERENCE A44031
#authors Tercero, J.C.; Riles, L.E.; Wickner, R.B.
#journal J. Biol. Chem. (1992) 267:20270-20276
#title Localized mutagenesis and evidence for post-transcriptional
regulation of MAK3. A putative N-acetyltransferase required
for double-stranded RNA virus propagation in Saccharomyces
cerevisiae.
#cross-references MUID:93015901
#accession S31260
##molecule_type DNA
##residues 1-93 ##label TER
##cross-references EMBL:M95912
GENETICS
#gene LISTA:NHP6A
CLASSIFICATION #superfamily nonhistone chromosomal protein HMG-2; HMG box
homology
KEYWORDS chromosomal protein
FEATURE
   18-93       #domain HMG box homology #label HMG1
SUMMARY #length 93 #molecular-weight 10802 #checksum 4901

SEQUENCE

Initial Score = 9 Optimized Score = 9 Significance = 4.22
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

X 10 
KALPVVLENARIL 

MVTPREPKKRTTRKKKDPNAPKRALSAYMFFANENRDI VRSENPDITFGQVGKKLGEKWKALTPEEKQPYEA 
10 20 30 40 50 60 70 

20 X 
KNCVDAKMTEEDKE 

i Mil! 
KAQADKKRYESEKELYNATLA 

80 X 90 

> 0 < 

0| |0 IntelliGenetics 

> 0 < 

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file 2-spt.res made by on Fri 24 Mar 95 7:54:42-PST.

Query sequence being compared: US-08-300-510-2 (1-27)
Number of sequences searched: 40292
Number of scores above cutoff: 3966

Results of the initial comparison of US-08-300-510-2 (1-27) 
Data bank : Swiss-Prot 30. all entries 

[Histogram: NUMBER OF SEQUENCES (logarithmic axis, 5 to 100000) plotted against SCORE, with a STDEV scale running from -1 to 9]



PARAMETERS

Similarity matrix           Unitary    K-tuple               2
Mismatch penalty            1          Joining penalty       20
Gap penalty                 1.00       Window size           27
Gap size penalty            0.05
Cutoff score                0
Randomization group         0

Initial scores to save      40         Alignments to save    15
Optimized scores to save    0          Display context       100

SEARCH STATISTICS

Scores:    Mean           Median           Standard Deviation
           3              5                1.34

Times:     CPU            Total Elapsed
           00:00:36.14    00:00:38.00

Number of residues:              14147368
Number of sequences searched:    40292
Number of scores above cutoff:   3966



Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.
Cut-off raised to 7.

The scores below are sorted by initial score.
Significance is calculated based on initial score.
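
As a rough check on the Significance column, the reported values are consistent with a z-score of the initial score against the search mean and standard deviation. The Python sketch below assumes that reading, which FastDB does not spell out here, and the function name is hypothetical. Because the report prints the mean and standard deviation rounded (3 and 1.34), the result only approximates the printed significance.

```python
def significance(initial_score: float, mean: float, stdev: float) -> float:
    """Standard deviations by which a score exceeds the mean (assumed definition)."""
    if stdev <= 0:
        raise ValueError("standard deviation must be positive")
    return (initial_score - mean) / stdev


# Mean and standard deviation as printed under SEARCH STATISTICS (rounded).
MEAN, STDEV = 3.0, 1.34

for score in (27, 13, 9):
    print(score, round(significance(score, MEAN, STDEV), 2))
```

With these rounded inputs the computed values land within about 0.1 of the significance figures printed for initial scores of 27, 13 and 9 (17.85, 7.44 and 4.46).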



2 100% similar sequences to the query sequence were found:

                                                        Init.  Opt.
    Sequence Name  Description                   Length Score  Score  Sig.   Frame

 1. FELA_FELCA     MAJOR ALLERGEN I POLYPEPTIDE      92    27     27  17.85  0
 2. FELB_FELCA     MAJOR ALLERGEN I POLYPEPTIDE      88    27     27  17.85  0

The list of other best scores is:

                                                        Init.  Opt.
    Sequence Name  Description                   Length Score  Score  Sig.   Frame


                   **** 7 standard deviations above mean ****
 3. POLG_FMDV1     GENOME POLYPROTEIN (NONSTRUCT   2333    13     14   7.44  0
                   **** 5 standard deviations above mean ****
 4. POLG_FMDVO     GENOME POLYPROTEIN (NONSTRUCT   2332    11     12   5.95  0
 5. RPOL_KLULA     PROBABLE DNA-DIRECTED RNA POL    982    10     10   5.21  0
 6. POLG_FMDVA     GENOME POLYPROTEIN (NONSTRUCT   2332    10     11   5.21  0
                   **** 4 standard deviations above mean ****
 7. NHPA_YEAST     NONHISTONE CHROMOSOMAL PROTEI     93     9      9   4.46  0
 8. NHPB_YEAST     NONHISTONE CHROMOSOMAL PROTEI     99     9      9   4.46  0
 9. HEMM_THEZO     MYOHEMERYTHRIN.                  118     9     11   4.46  0
10. PHEA_FREDI     C-PHYCOERYTHRIN ALPHA CHAIN.     164     9     10   4.46  0
11. ATPF_PROMO     ATP SYNTHASE B CHAIN, SODIUM     168     9     11   4.46  0
12. UCRI_SYNP2     CYTOCHROME B6-F COMPLEX IRON-    180     9      9   4.46  0
13. XYNB_STRLI     ENDO-1,4-BETA-XYLANASE B PREC    333     9      9   4.46  0
14. YB09_YEAST     HYPOTHETICAL 38.7 KD PROTEIN     347     9      9   4.46  0
15. ADH4_YEAST     ALCOHOL DEHYDROGENASE IV (EC     382     9      9   4.46  0
16. PGKB_TRYBB     PHOSPHOGLYCERATE KINASE, CYTO    421     9     11   4.46  0
17. PRO1_LISMO     ZINC METALLOPROTEINASE PRECUR    510     9     10   4.46  0
18. VNUC_INCCA     NUCLEOPROTEIN.                   565     9      9   4.46  0
19. PRIM_BPT7      DNA PRIMASE, CHAINS A AND B (    566     9     10   4.46  0
20. PRIM_BPT3      DNA PRIMASE (EC 2.7.7.-).        566     9     10   4.46  0
21. CEIB_ECOLI     COLICIN IB PROTEIN.              626     9      9   4.46  0
22. CEIA_ECOLI     COLICIN IA PROTEIN.              626     9      9   4.46  0
23. CRAC_DICDI     PROTEIN CRAC.                    698     9     10   4.46  0
24. LON_BACBR      ATP-DEPENDENT PROTEASE LA (EC    779     9      9   4.46  0
25. YKS8_YEAST     70 KD PEROXISOMAL MEMBRANE PR    853     9     10   4.46  0
26. POLG_FMDVS     GENOME POLYPROTEIN (COAT PROT    861     9     10   4.46  0
27. FOX2_YEAST     PEROXISOMAL HYDRATASE-DEHYDRO    900     9     10   4.46  0
28. BVGC_BORPE     SENSOR PROTEIN BVGC (EC 2.7.3    936     9      9   4.46  0
29. SYI_YEAST      ISOLEUCYL-TRNA SYNTHETASE (EC   1072     9      9   4.46  0
30. BVGS_BORBR     VIRULENCE SENSOR PROTEIN BVGS   1238     9      9   4.46  0
31. NOS1_RAT       NITRIC-OXIDE SYNTHASE, BRAIN    1429     9      9   4.46  0
32. SSP5_STRSA     AGGLUTININ RECEPTOR PRECURSOR   1473     9     10   4.46  0
                   **** 3 standard deviations above mean ****
33. NIFW_KLEPN     NIFW PROTEIN.                     86     8      8   3.72  0
34. RL12_EUGGR     50S RIBOSOMAL PROTEIN L12.       131     8      9   3.72  0
35. YBEG_ECOLI     HYPOTHETICAL 18.4 KD PROTEIN     156     8      8   3.72  0
36. RL12_SPIOL     50S RIBOSOMAL PROTEIN L12, CH    189     8      9   3.72  0
37. INA5_HUMAN     INTERFERON ALPHA-5 PRECURSOR     189     8      9   3.72  0
38. A1AG_RABIT     ALPHA-1-ACID GLYCOPROTEIN PRE    201     8      9   3.72  0
39. RIP3_SAPOF     RIBOSOME-INACTIVATING PROTEIN    236     8     11   3.72  0



1. US-08-300-510-2 (1-27)

FELA_FELCA MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PR

ID   FELA_FELCA     STANDARD;      PRT;    92 AA.
AC   P30438;
DT   01-APR-1993 (REL. 25, CREATED)
DT   01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT   01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE   MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MAJOR FORM PRECURSOR (FEL D I)
DE   (CAT-1) (AG 4).
GN   CH1.
OS   FELIS CATUS (CAT).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; CARNIVORA.
RN   [1]
RP   SEQUENCE FROM N.A., AND SEQUENCE OF 23-92.
RC   TISSUE=SALIVARY GLAND;
RM   92052157
RA   MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.M., ROGERS B.L.,
RA   BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN   [2]
RP   SEQUENCE FROM N.A.
RM   92241678
RA   GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA   ROGERS B.L.;
RL   GENE 113:263-268(1992).
RN   [3]
RP   SEQUENCE OF 23-62, AND CHARACTERIZATION.
RM   91287714
RA   DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL   MOL. IMMUNOL. 28:301-309(1991).
RN   [4]
RP   CHARACTERIZATION.
RA   LEITERMANN K., OHMAN J.L. JR.;
RL   J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).
CC   -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC   -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC       DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC   -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC   -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC       RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC       OF THIS ALLERGEN SUBUNIT.
CC   -!- SIMILARITY: TO UTEROGLOBIN.
DR   EMBL; M74952; FDFELD1.
DR   PIR; JC1136; JC1136.
DR   PROSITE; PS00403; UTEROGLOBIN_1.
DR   PROSITE; PS00404; UTEROGLOBIN_2.
KW   ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT   SIGNAL        1     22
FT   CHAIN        23     92       MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT   DISULFID     25     25       INTERCHAIN (POTENTIAL).
FT   DISULFID     92     92       INTERCHAIN (POTENTIAL).
FT   VARIANT      51     51       K -> N.
FT   CONFLICT      5      5       R -> C (IN REF. 2).
FT   CONFLICT     18     18       W -> S (IN REF. 2).
FT   CONFLICT     82     82       L -> V (IN REF. 2).
SQ   SEQUENCE   92 AA;  10252 MW;  43206 CN;

Initial Score = 27 Optimized Score = 27 Significance = 17.85
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



                                                  KALPVVLENARILKNCVDAKMT
                                                  ||||||||||||||||||||||
MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMT
10 20 30 40 50 60 70

X
EEDKE
|||||
EEDKENALSLLDKIYTSPLC
X 80 90
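
A 100% identity hit with no gaps means the query occurs verbatim in the database sequence, so its position can be recovered with a plain substring search. The Python sketch below is illustrative only: the function name is hypothetical, and both sequences are copied from the alignment above as printed.

```python
from typing import Optional, Tuple

QUERY = "KALPVVLENARILKNCVDAKMTEEDKE"  # US-08-300-510-2 (1-27)
# FELA_FELCA sequence as printed in the alignment above.
SUBJECT = (
    "MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMT"
    "EEDKENALSLLDKIYTSPLC"
)


def find_exact(query: str, subject: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based (start, end) of the first exact occurrence, or None."""
    offset = subject.find(query)
    if offset < 0:
        return None
    return offset + 1, offset + len(query)


print(len(SUBJECT), find_exact(QUERY, SUBJECT))
```

For this pair the search places the 27-residue query at residues 51 to 77 of the 92-residue precursor, inside the mature chain annotated as 23-92.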



" FELB^FELCA^ A JOr" ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PR 

ID   FELB_FELCA     STANDARD;      PRT;    88 AA.
AC   P30439;
DT   01-APR-1993 (REL. 25, CREATED)
DT   01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT   01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE   MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1 MINOR FORM PRECURSOR (FEL D I)
DE   (CAT-1) (AG 4).
GN   CH1.
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; CARNIVORA.
RN   [1]
RP   SEQUENCE FROM N.A., AND SEQUENCE OF 19-88.
RM   92052157
RA   MORGENSTERN J.P., GRIFFITH I.J., BRAUER A.M., ROGERS B.L.,
RA   BOND J.F., CHAPMAN M.D., KUO M.-C.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88:9690-9694(1991).
RN   [2]
RP   SEQUENCE FROM N.A.
RA   GRIFFITH I.J., CRAIG S., POLLOCK J., YU X.-B., MORGENSTERN J.P.,
RA   ROGERS B.L.;
RL   GENE 113:263-268(1992).
RP   SEQUENCE OF 19-58, AND CHARACTERIZATION.
RA   DUFFORT O.A., CARREIRA J., NITTI G., POLO F., LOMBARDERO M.;
RL   MOL. IMMUNOL. 28:301-309(1991).
RN   [4]
RP   CHARACTERIZATION.
RA   LEITERMANN K., OHMAN J.L. JR.;
RL   J. ALLERGY CLIN. IMMUNOL. 74:147-153(1991).
CC   -!- DISEASE: MAJOR ALLERGEN PRODUCED BY THE DOMESTIC CAT.
CC   -!- SUBUNIT: HETEROTETRAMER COMPOSED OF TWO NON-COVALENTLY LINKED
CC       DISULFIDE-LINKED HETERODIMER OF CHAINS 1 AND 2.
CC   -!- TISSUE SPECIFICITY: SALIVA, AND SEBACEOUS GLANDS.
CC   -!- ALTERNATIVE PRODUCTS: USAGE OF TWO DIFFERENT INITIATOR MET ARE
CC       RESPONSIBLE FOR THE PRODUCTION OF TWO FORMS OF THE SIGNAL SEQUENCE
CC       OF THIS ALLERGEN SUBUNIT.
CC   -!- SIMILARITY: TO UTEROGLOBIN.
DR   EMBL; M74953; FDFELD1B.
DR   PIR; JC1126; JC1126.
DR   PROSITE; PS00403; UTEROGLOBIN_1.
DR   PROSITE; PS00404; UTEROGLOBIN_2.
KW   ALLERGEN; SIGNAL; ALTERNATIVE SPLICING.
FT   CHAIN        19     88       MAJOR ALLERGEN I POLYPEPTIDE CHAIN 1.
FT   DISULFID     21     21       INTERCHAIN (POTENTIAL).
FT   DISULFID     88     88       INTERCHAIN (POTENTIAL).
FT   VARIANT      47     47       K -> N.



h tum-uiut TE 7a l v un act . m. 

SQ   SEQUENCE   88 AA;  9614 MW;  39445 CN;

Initial Score = 27 Optimized Score = 27 Significance = 17.85
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 

KALPVVLENARILKNCVDAKMTEEDK

||||||||||||||||||||||||||
MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMTEEDK

10 20 30 40 X 50 60 70

X
E

|

ENALSLLDKIYTSPLC
X 80 



3 " P0LG 8 FMDV1 51 °GEN0ME 2 P0LYPR0TEIN (NONSTRUCTURAL PROTEIN P20A; CO 



ID   POLG_FMDV1     STANDARD;      PRT;  2333 AA.
AC   P03306;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE   VP4; CORE PROTEIN P52; GENOME-LINKED PROTEINS VPG1 TO VPG3; PICORNAIN
DE   3C (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED RNA POLYMERASE
DE   (EC 2.7.7.48)).
OS   FOOT-AND-MOUTH DISEASE VIRUS (STRAIN A10-61) (APHTHOVIRUS A).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   84169547
RA   CARROLL A.R., ROWLANDS D.J., CLARKE B.E.;
RL   NUCLEIC ACIDS RES. 12:2461-2472(1984).
RN   [2]
RP   SEQUENCE OF 115-1048 FROM N.A.
RM   82211814
RA   BOOTHROYD J.C., HARRIS T.J.R., ROWLANDS D.J., LOWE P.A.,
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
DR   EMBL; X00429; PIFMDV1.
KW   POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW   HYDROLASE; THIOL PROTEASE; MYRISTYLATION.
FT   CHAIN         1    201       NONSTRUCTURAL PROTEIN P20A.
FT   CHAIN       202    286       COAT PROTEIN VP4.
FT   CHAIN       287    504       COAT PROTEIN VP2.
FT   CHAIN       505    725       COAT PROTEIN VP3.
FT   CHAIN       726    937       COAT PROTEIN VP1.
FT   CHAIN       938   1578       CORE PROTEIN P52.
FT   CHAIN      1579   1601       GENOME-LINKED PROTEIN VPG1.
FT   CHAIN      1602   1625       GENOME-LINKED PROTEIN VPG2.
FT   CHAIN      1626   1649       GENOME-LINKED PROTEIN VPG3.
FT   CHAIN      1650   1863       PROTEASE P20B.
FT   CHAIN      1864   2333       RNA-DIRECTED RNA POLYMERASE P56A.
FT   LIPID       202    202       MYRISTATE.
FT   CONFLICT    396    396       S -> C (IN REF. 2).
FT   CONFLICT    632    632       P -> L (IN REF. 2).



au atautdit. Mm wwi — 25^6*o nw ; i -roots//* oh. 

Initial Score = 13 Optimized Score = 14 Significance = 7.44
Residue Identity = 41% Matches = 15 Mismatches = 12
Gaps = 9 Conservative Substitutions = 0



GLFAYKAATRAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYCSCVSRSMLQKMKAHVDPEPHHEGLIVDTRD
1800 1810 1820 1830 1840 1850 1860 1870

X 10 20 X

KAL- PVVLENARILKNCVDAKMTEEDKE

,,. 



VEERVHVMRKTKLAPTVAYGVFNPEFGPAALSNKDPRLNEGVVLDDVIFSKHKGDAKMTEE 

1880 1890 1900 1910 1920 1930 X 1940 

YASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGK^ 

1950 1960 1970 1980 1990 2000 2010 



REYKFACQTFLKDEIRPMEK 
2020 2030 



4. US-08-300-510-2 (1-27)

POLG_FMDVO GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; CO

ID   POLG_FMDVO     STANDARD;      PRT;  2332 AA.
AC   P03305;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE   VP4; CORE PROTEINS P12, P34, P14; GENOME-LINKED PROTEIN VPG; PROTEASE
DE   (EC 3.4.22.-); RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48)).
OS   FOOT-AND-MOUTH DISEASE VIRUS (STRAINS O1K AND O1BFS) (APHTHOVIRUS O).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=O1K;
RM   84297249
RA   FORSS S., STREBEL K., BECK E., SCHALLER H.;
RL   NUCLEIC ACIDS RES. 12:6587-6601(1984).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=O1BFS;
RM   83143292
RA   MAKOFF A.J., PAYNTER C.A., ROWLANDS D.J., BOOTHROYD J.C.;
RL   NUCLEIC ACIDS RES. 10:8285-8295(1982).
RN   [3]
RP   X-RAY CRYSTALLOGRAPHY (2.9 ANGSTROMS).
RM   89143740
RA   ACHARYA R., FRY E., STUART D., FOX G., ROWLANDS D., BROWN F.;
RL   NATURE 337:709-716(1989).
CC   -!- THE STRAIN O1K SEQUENCE IS SHOWN.
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- THE COAT PROTEIN VP1 CONTAINS THE MAIN ANTIGENIC DETERMINANTS OF
CC       THE VIRION; THEREFORE, CHANGES IN ITS SEQUENCE MUST BE
CC       RESPONSIBLE FOR THE HIGH ANTIGENIC VARIABILITY OF THE VIRUS.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
DR   EMBL; X00871; PIFMDV2.
DR   EMBL; J02185; PIO1VP.
DR   PIR; A03907; GNNYF.
KW   POLYPROTEIN; COAT PROTEIN; CORE PROTEIN; RNA-DIRECTED RNA POLYMERASE;
KW   HYDROLASE; THIOL PROTEASE; MYRISTYLATION.
FT   CHAIN         1    201       NONSTRUCTURAL PROTEIN P20A.



t- 1 


LtiH i w 


C vfc 


CO 0 


i, u h a r a J * t * w • 


FT 


CHAIN 


287 


504 


COAT PROTEIN VP2. 


FT 


f Lt A T M 

CHA I N 




724 

f c ~ 


COAT PROTEIN VP3. 


FT 


f* it A t ki 

CHA IN 


/ Cu 


V J / 


COAT PROTEIN VP1. 


FT 


CHAIN 


7 JO 


1 1 07 


CORE PROTEIN P12. 


FT 


pit* t M 
CHA IN 


1 1 Oft 


1 425 


CORE PROTEIN P34, 


FT 


CHAIN 


i AO L 
I f CO 


1 ^7ft 


CORE PROTEIN P14. 


FT 


Li A T KI 
CHAIN 


1 ^79 


1 A49 


GENOME-LINKED PROTEIN VPG. 


FT 


CHAIN 


loDU 


1 8A2 
1 DOC 


PROTEASE. 


FT 


CHAIN 


1 flAl 
1 ooo 


2332 


RNA-DIRECTED RNA POLYMERASE. 


FT 


i t o t rv 

LIPID 


*?n2 
cue 


cue 


MYRISTATE. 


FT 


U I bULr 1 U 


oil 


51 1 


INTERCHAIN (IN VP3 DIHER) . 


FT 


U IbULr i U 


An a 


858 


IN VP2-VP1 DIHER. 


FT 


VARIANT 


1 0 A 


/ o U 


I -> V (IN STRAIN Q1BFS) . 


FT 


VAK I AN 1 


OAfl 
OUo 


808 

QUO 


G -> R (IN STRAIN 01BFS) . 


C T 

r i 


ii A £> T AWT 
VAh I AW ! 


ft A 1 


861 


N -> S (IN STRAIN 01BFS). 


ova 




2332 


AA7 258924 


HWJ 19411374 CNJ 



Initial Score = 11 Optimized Score = 12 Significance = 5.95
Residue Identity = 36% Matches = 13 Mismatches = 14
Gaps = 9 Conservative Substitutions = 0

GLFAYRAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYCSCVSRSMLLKMKAHIDPEPHHEGLIVDTRD
1800 1810 1820 1830 1840 1850 1860 1870

X 10 20 X

KAL- PVVLENARILKNCVDAKMTEEDKE

II III I I M HO

VEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFSKHKGDTKMSEEDKALFRRCAAD
1880 1890 1900 1910 1920 1930 X 1940

YASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEVEAALKLMEK
1950 1960 1970 1980 1990 2000 2010 

REYKFVCQTFLKDEIRPLEK 
2020 2030 



5. US-08-300-510-2 (1-27)

RPOL_KLULA PROBABLE DNA-DIRECTED RNA POLYMERASE (EC 2.7.7.6)

ID   RPOL_KLULA     STANDARD;      PRT;   982 AA.
AC   P05472;
DT   01-NOV-1988 (REL. 09, CREATED)
DT   01-NOV-1988 (REL. 09, LAST SEQUENCE UPDATE)
DT   01-NOV-1988 (REL. 09, LAST ANNOTATION UPDATE)
DE   PROBABLE DNA-DIRECTED RNA POLYMERASE (EC 2.7.7.6) (KILLER PLASMID
DE   PGKL2 PROTEIN 6).
OS   KLUYVEROMYCES LACTIS (YEAST).
OG   PLASMID PGKL-2.
OC   EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=CBS 2359;
RM   88289339
RA   TOMMASINO S., RICCI S., GALEOTTI C.L.;
RL   NUCLEIC ACIDS RES. 16:5863-5878(1988).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=IFO 1267;
RM   88335549
RA   WILSON D.W., MEACOCK P.A.;
RL   NUCLEIC ACIDS RES. 16:8097-8112(1988).
CC   -!- FUNCTION: THE PRESENCE OF THE TWO LINEAR PLASMIDS, TERMED
CC       PGKL1 AND PGKL2, IN STRAINS OF KLUYVEROMYCES LACTIS CONFERS
CC       THE KILLER PHENOTYPE TO THE HOST CELL, BY PROMOTING THE



H I U A I (M HDUC IU 1""""' 

CC       STRAINS.
DR   EMBL; X07776; KLPGKL2.
DR   EMBL; X07946; KLPOLK.
DR   PIR; S00964; S00964.
KW   TRANSCRIPTION; DNA-DIRECTED RNA POLYMERASE; PLASMID.
FT   CONFLICT     32     32       T -> N (IN REF. 2).
FT   CONFLICT    302    302       I -> K (IN REF. 2).
FT   CONFLICT    917    917       F -> C (IN REF. 2).
SQ   SEQUENCE   982 AA;  113961 MW;  4923764 CN;

Initial Score = 10 Optimized Score = 10 Significance = 5.21
Residue Identity = 37% Matches = 10 Mismatches = 17
Gaps = 0 Conservative Substitutions = 0

D ILIGLGAMNTIKEIWSIDRSKIKIDSKTGRINWIRYDKEMEIGQYFKICLSYMRSLGRDILIKNDKYSI VE 
6 40 650 660 670 680 690 700 

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

FDNSYLPKTDTMKFGDLDVLDLRIYKGIVMLPLCLRSTYLNKLYVDRK^ 

710 720 730 X 740 750 760 X 770 

GHRVDRCIRSVI VPDPTLDIDTIKIPFGANIGCEYGLLNRQPSLNVDSIKLVKLK6GSNKTIAINPLLC6SF 
glO 820 830 840 o3u 



780 790 800 

NADFDGDEMNI 
860 



PO^FMSvA^^lNui^POLYPROTEIN (NONSTRUCTURAL PROTEIN P20AJ CO 



ID   POLG_FMDVA     STANDARD;      PRT;  2332 AA.
AC   P03308; P03312;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   01-JAN-1988 (REL. 06, LAST SEQUENCE UPDATE)
DT   01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (NONSTRUCTURAL PROTEIN P20A; COAT PROTEINS VP1 TO
DE   VP4; CORE PROTEINS P14, P41, P19; GENOME-LINKED PROTEINS VPG1 TO
DE   VPG3; PICORNAIN 3C (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED
OS   FOOT-AND-MOUTH DISEASE VIRUS (STRAIN A12) (APHTHOVIRUS A).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; PICORNAVIRIDAE; APHTHOVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RA   ROBERTSON B.H., GRUBMAN M.J., WEDDELL G.N., MOORE D.M., WELSH J.D.,
RA   FISCHER T., DOWBENKO D.J., YANSURA D.G., SMALL B., KLEID D.G.;
RL   J. VIROL. 54:651-660(1985).
RN   [2]
RP   SEQUENCE OF 1863-2332 FROM N.A.
RA   ROBERTSON B.H., MORGAN D.O., MOORE D.M., GRUBMAN M.J., CARD J.,
RA   FISCHER T., WEDDELL G.N., DOWBENKO D.J., YANSURA D.G.;
RL   VIROLOGY 126:614-623(1983).
RN   [3]
RP   SEQUENCE OF 715-955 FROM N.A.

II SITg., YANSURA D.G., SMALL B. < DOWBENKO D. J.. MOORE D.M., 



GRUBMAN V. J.. MCKERCHER P.D.. MORGAN D.O.. ROBERTSON B.H. 



RA KLEID 
RA 

RA BACHRACH H.L.'r 

cc " IE ?5«- 2 s'pEclnc"»niInc-ct E »vA CE s in «vo yield hatim noteim. 

" -\- sILn! THE VISUS CAPSID IS COMPOSED OF 60 ICQSAHEDRAL UNITS, 



tr- 
ee 

DR 
DR 
KW 
KW 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
SG 



LACK U> WI TIUff IS LUtlKUatU ur UKc UUf» trt^rt br KhtiJLi.vi^ 
VP3 i AND VP4. 



EMBLJ M109755 APHA12CD. 
PIRI A25794! GNNY4F. 



CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

CHAIN 

LIPID 

SEQUENCE 



CORE PROTEIN; RNA-DIRECTED RNA POLYMERASES 



t niuL 


PROTEASE ; 


MYRISTYLATION. 




i 
i 


900 

Cvv 


NONSTRUCTURAL PROTEIN 


P20A. 






COAT PROTEIN VP4. 




9ft A 
COO 




COAT PROTEIN VP2. 




CA/I 

Du*+ 


791 


fOAT PROTEIN VP3. 




/ CH 


QT7 
7j/ 


COAT PROTEIN VP1. 




7 JO 


QCJ7 


CORE PROTEIN X. 




OR A 


1 1 07 


CORE PROTEIN P14. 




1 1 Oo 


1 *fr C 3 


CORE PROTEIN P41. 




i *rco 


1 S7P. 
i %J / o 


CORE PROTEIN P19. 




i J/7 


1 A0 1 


GENOME-LINKED PROTEIN 


VPGl. 


1 402 


1625 


GENOME-LINKED PROTEIN 


VPG2. 


1626 


1649 


GENOME-LINKED PROTEIN 


VPG3 . 


1650 


1862 


PROTEASE. 




1863 


2332 


RNA-DIRECTED RNA POLYMERASE. 


201 


201 


MYRISTATE. 




2332 


AA? 259408 MW; 19347576 CN; 





Initial Score = 10 Optimized Score = 11 Significance = 5.21
Residue Identity = 33% Matches = 12 Mismatches = 15
Gaps = 9 Conservative Substitutions = 0



SLFAYKAATKAGYCGGAVLAKDGADTF I VGTHSAGGNGVGYCSCVSKSMLLRMKAHVDPEPQHEGLI VDTRD 
1800 1810 1820 1830 1840 1850 I860 1870 

X 10 20 X 

KAL PVVLENARILKNCVDAKMTEEDKE 

II Ml I I 1 1 IH 

VEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLMEGVVLDEVIF3KHKGDTKHSAEDKALFRACAA0 

1900 1910 1920 1930 X ,D 



1880 



1890 



1940 



YASRLHSVLGTANAPLSIYEAIKGVDGLDAMESDTAPGLPWAFQGKRRGALIDFENGTVGPEVEAALKLMEK 

1970 1980 1990 2000 2010 



1950 



I960 



REYKFVCQTFLKDEIRPMEK 
2020 2030 



7. US-08-300-510-2 (1-27) 

NHPA_YEAST NONHISTONE CHROMOSOMAL PROTEIN 6A. 

ID NHPA_YEAST STANDARD; PRT; 93 AA. 

AC   P11632;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE)
DE   NONHISTONE CHROMOSOMAL PROTEIN 6A.
GN   NHP6A OR NHPA.
OS   SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
OC   EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=DBY1091 / DKY1;
RM   90153974
RA   KOLODRUBETZ D., BURGUM A.;
RL   J. BIOL. CHEM. 265:3234-3239(1990).
RN   [2]
RP   SEQUENCE FROM N.A.
RM   93015901
RA   TERCERO J.C., RILES L.E., WICKNER R.B.;
RL   J. BIOL. CHEM. 267:20270-20276(1992).
CC   -!- SUBCELLULAR LOCATION: NUCLEAR.
DR   EMBL; X15317; SCNHP6A.
DR   EMBL; M95912; SCMAKNHP.
DR   PIR; A35072; A35072.
DR   PIR; C44031; C44031.
KW   NUCLEAR PROTEIN; CHROMOSOMAL PROTEIN; DNA-BINDING.
FT   DNA_BIND     18     93       HMG BOX.
SQ   SEQUENCE   93 AA;  10802 MW;  41127 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.46
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



X 10 
KALPVVLENARIL 

MVTPREPKKRTTRKKKDPNAPKRALSAYMFFANENRDIVRSENPDITFGQVGKKLGEKWKAL.TPEEKQPYEA 
l0 20 30 40 50 60 

20 X 
KNCVDAKMTEEDKE 

i ! I I II 
KASADKKRYESEKELYNATLA 

80 X 90 



8. US-08-300-510-2 (1-27)

NHPB_YEAST NONHISTONE CHROMOSOMAL PROTEIN 6B.

ID NHPB_YEAST STANDARD; PRT; 99 AA.
AC P11633;
DT 01-OCT-1989 (REL. 12, CREATED)
DT 01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE)
DE NONHISTONE CHROMOSOMAL PROTEIN 6B.
GN NHP6B OR NHPB.
OS SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
OC EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=DBY1091 / DKY1;
RM 90153974
RA KOLODRUBETZ D., BURGUM A.;
RL J. BIOL. CHEM. 265:3234-3239(1990).
CC -!- SIMILARITY: TO MAMMALIAN NONHISTONE PROTEIN HMG1.
CC -!- SUBCELLULAR LOCATION: NUCLEAR.
DR EMBL; X15318; SCNHP6B.
DR PIR; B35072; B35072.
KW NUCLEAR PROTEIN; CHROMOSOMAL PROTEIN; DNA-BINDING.
FT DNA_BIND 24 99 HMG BOX.
SQ SEQUENCE 99 AA; 11476 MW; 45060 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.46
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

X 

KALPVVL 



MAATKEAKQPKEPKKRTTRRKKDPNAPKRGLSAYMFFANENRDIVRSENPDVTFGQVGRILGERWKALTAEE 
10 20 30 40 



50 60 X 70 



10 20 X 

ENARILKNCVDAKHTEEDKE 

1 II I II 



80 90 X 



9. US-08-300-510-2 (1-27) 

HEMM_THEZO MYOHEMERYTHRIN.

ID HEMM_THEZO STANDARD; PRT; 118 AA.
AC P02247;
DT 21-JUL-1986 (REL. 01, CREATED)
DT 01-OCT-1993 (REL. 27, LAST SEQUENCE UPDATE)
DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE)
DE MYOHEMERYTHRIN.
OS THEMISTE ZOSTERICOLA.
OC EUKARYOTA; METAZOA; SIPUNCULA; GOLFINGIIDAE.
RN [1]
RP SEQUENCE.
RM 76136381
RA KLIPPENSTEIN G.L., COTE J.L., LUDLAM S.E.;
RL BIOCHEMISTRY 15:1128-1136(1976).
RN [2]
RP X-RAY CRYSTALLOGRAPHY (5.5 ANGSTROMS).
RM 75176901
RA HENDRICKSON W.A., KLIPPENSTEIN G.L., WARD K.B.;
RL PROC. NATL. ACAD. SCI. U.S.A. 72:2160-2164(1975).
RN [3]
RP X-RAY CRYSTALLOGRAPHY (1.7 ANGSTROMS) AND REVISION TO 34-35.
RM 88062755
RA SHERIFF S., HENDRICKSON W.A., SMITH J.L.;
RL J. MOL. BIOL. 197:273-296(1987).
RN [4]
RP STRUCTURE.
RM 77165245
RA HENDRICKSON W.A., WARD K.B.;
RL J. BIOL. CHEM. 252:3012-3018(1977).
CC -!- FUNCTION: MYOHEMERYTHRIN IS AN OXYGEN-BINDING PROTEIN FOUND IN
CC THE RETRACTOR MUSCLES OF CERTAIN WORMS. THE OXYGEN-BINDING SITE
CC CONTAINS TWO IRON ATOMS.
CC -!- SUBUNIT: MONOMER.
CC -!- TISSUE SPECIFICITY: MUSCLE.
CC -!- SIMILARITY: TO HEMERYTHRINS FROM VARIOUS MARINE WORMS.
DR PIR; A37369; HRTHM.
DR PDB; 2MHR; 16-APR-88.
DR PROSITE; PS00550; HEMERYTHRINS.
KW OXYGEN TRANSPORT; MUSCLE PROTEIN; METAL-BINDING; IRON; 3D-STRUCTURE.
FT METAL 25 25 IRON 1.
FT METAL 54 54 IRON 1.
FT METAL 58 58 IRON 1 AND 2.
FT METAL 73 73 IRON 2.
FT METAL 77 77 IRON 2.
FT METAL 106 106 IRON 2.
FT METAL 111 111 IRON 1 AND 2.
FT HELIX 12 14
FT HELIX 19 37
FT HELIX 41 64
FT TURN 65 66
FT TURN 68 69
FT HELIX 70 85
FT TURN 86 86
FT HELIX 93 109
FT TURN 110 110
FT HELIX 111 114
FT TURN 115 117
SQ SEQUENCE 118 AA; 13778 MW; 75202 CN;



Initial Score 



9 Optimized Score = 



11 Significance = 4,4 



9 Conservative Substitutions = u 

X 

KALPVV- 
! II 

GHEIPEPYVHDESFRVFYESLDEEHKKIFKGIFDCIRDNSAPNLATLVKVTTNHFTHEEAMMDAAKYSEVVP 

10 20 30 40 50 60 X 

10 20 X 
LENARILKNCVDAKMTEEDKE 

11 I Mil II 
HKKMHKDFLEKIGGLSAPVDAKNVDYCKEMLVNHIKGTDFKYKGKL 

80 90 100 HO 



Gaps 



10. US-08-300-510-2 (1-27) 

PHEA_FREDI C-PHYCOERYTHRIN ALPHA CHAIN.

ID PHEA_FREDI STANDARD; PRT; 164 AA.
AC P05098;
DT 13-AUG-1987 (REL. 05, CREATED)
DT 13-AUG-1987 (REL. 05, LAST SEQUENCE UPDATE)
DT 01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE)
DE C-PHYCOERYTHRIN ALPHA CHAIN.
GN CPEA.
OS FREMYELLA DIPLOSIPHON (CALOTHRIX PCC 7601).
OC PROKARYOTA; GRACILICUTES; OXYPHOTOBACTERIA;
OC CYANOBACTERIA (BLUE-GREEN ALGAE); NOSTOCALES.
RN [1]
RP SEQUENCE FROM N.A.
RA MAZEL D., GUGLIELMI G., HOUMARD J., SIDLER W., BRYANT D.A.,
RA TANDEAU DE MARSAC N.;
RL NUCLEIC ACIDS RES. 14:8279-8290(1986).
RN [2]
RP SEQUENCE.
RM 87000169
RA SIDLER W., KUMPF B., RUDIGER W., ZUBER H.;

!:°^ u ^oru^r E sn;frH0 4 ro^:?^K bile pigment-protein 

| - SK^Lr^TirU!.^ THE RODS OF THE PH,COB,USOH E . 

CC -!- SUBUNIT: HETERODIMER OF AN ALPHA AND A BETA CHAIN.
CC -!- PTM: CONTAINS ONE COVALENTLY LINKED BILIN CHROMOPHORE.
CC -!- INDUCTION: IN RESPONSE TO GREEN LIGHT BUT NOT TO RED LIGHT.
DR EMBL; X04592; FDCPEAB.
KW PHYCOBILISOME; ELECTRON TRANSPORT; PHOTOSYNTHESIS; BILE PIGMENT.
FT BINDING 82 82 PHYCOERYTHROBILIN CHROMOPHORE.
FT BINDING 139 139 PHYCOERYTHROBILIN CHROMOPHORE.
SQ SEQUENCE 164 AA; 17626 MW; 123449 CN;

Initial Score = 9 Optimized Score = 10 Significance = 4.46 

Residue Identity = 29% Matches = 11 Mismatches = 16
Gaps = 1 Conservative Substitutions = 0

Y 10 20 X 

KALPVV LENARILKNCVDAKMTEEDKE 

MKSVVTTVI AAADAAGRFPSTSDLESVQGSIQRAAARLEAAEKLANNIDAVATEAYNACIKKYPYLNNSGEA 

10 20 X 30 40 50 60 rv 

NSTDTFKAKCARD IKHYLRLIQYSLVVGGTGPLDEWGI AGQREVYRALGLPTAPYVEALSFARNRGCAPRDM 
80 90 100 HO 120 130 1<tu 






150 



11. US-08-300-510-2 (1-27) 

ATPF_PROMO ATP SYNTHASE B CHAIN, SODIUM ION SPECIFIC (EC 3.6.

ID ATPF_PROMO STANDARD; PRT; 168 AA.
AC P21904;
DT 01-MAY-1991 (REL. 18, CREATED)
DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE ATP SYNTHASE B CHAIN, SODIUM ION SPECIFIC (EC 3.6.1.37).
GN ATPF OR UNCF.
OS PROPIONIGENIUM MODESTUM.
OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; ANAEROBIC RODS;
OC BACTEROIDACEAE.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=DSM 2376;
RM 91067471
RA KAIM G., LUDWIG W., DIMROTH P., SCHLEIFER K.H.;
RL NUCLEIC ACIDS RES. 18:6697-6697(1990).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=DSM 2376;
RM 92339434
RA KAIM G., LUDWIG W., DIMROTH P., SCHLEIFER K.H.;
RL EUR. J. BIOCHEM. 207:463-470(1992).
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=DSM 2376;
RM 91016937
RA ESSER U., KRUMHOLZ L.R., SIMONI R.D.;
RL NUCLEIC ACIDS RES. 18:5887-5888(1990).
RN [4]
RP SEQUENCE OF 1-7.
RM 93138123
RA GERIKE U., DIMROTH P.;
RL FEBS LETT. 316:89-92(1993).
CC -!- SUBUNIT: F-TYPE ATPASES HAVE 2 COMPONENTS, CF(1) - THE CATALYTIC
CC CORE - AND CF(0) - THE MEMBRANE PROTON CHANNEL. CF(1) HAS FIVE
CC SUBUNITS: ALPHA(3), BETA(3), GAMMA(1), DELTA(1), EPSILON(1). CF(0)
CC HAS THREE MAIN SUBUNITS: A, B AND C.
CC -!- THE ATPASE OF P. MODESTUM IS OF SPECIAL INTEREST BECAUSE IT
CC USES SODIUM IONS INSTEAD OF PROTONS AS THE PHYSIOLOGICAL
CC COUPLING ION.
CC -!- SIMILARITY: TO OTHER B SUBUNITS AND ALSO TO B' SUBUNITS.
DR EMBL; X54810; PMATPBS.
DR EMBL; X66102; PMATPACBD.
DR EMBL; X53960; PMUNC1.
DR EMBL; X58461; PMUNC2.
DR PIR; S12620; S12620.
DR PIR; S23323; S23323.
DR PIR; S23336; S23336.
KW SODIUM TRANSPORT; TRANSMEMBRANE; CF(0).
SQ SEQUENCE 168 AA; 19201 MW; 124854 CN;

Initial Score = 9 Optimized Score = 11 Significance = 4.46
Residue Identity = 44% Matches = 12 Mismatches = 14
Gaps = 1 Conservative Substitutions = 0

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE

I II II 1 I I I HI 
MAPQNMPAVSIDINMFMQI INFLILMFFFKKYFQKPI AKVL-DARKEKIANDLKQAEIDKEMAAKANGEAQG 



IVKSAKTEANEMLLRAEKKADERKETILKEANTQREKMLKSAEVEIEKMKEQARKEL6LEVTDLAVKLAEKM 

110 120 130 140 



80 



90 



100 



I NEKVDAK I GANLLDQF 
150 160 



12. US-08-300-510-2 (1-27)

UCRI_SYNP2 CYTOCHROME B6-F COMPLEX IRON-SULFUR SUBUNIT PRECUR

ID UCRI_SYNP2 STANDARD; PRT; 180 AA.
AC P26292;
DT 01-MAY-1992 (REL. 22, CREATED)
DT 01-MAY-1992 (REL. 22, LAST SEQUENCE UPDATE)
DT 01-MAY-1992 (REL. 22, LAST ANNOTATION UPDATE)
DE CYTOCHROME B6-F COMPLEX IRON-SULFUR SUBUNIT PRECURSOR (EC 1.10.99.1)
DE (RIESKE IRON-SULFUR PROTEIN).
GN PETC.
OS SYNECHOCOCCUS SP. (STRAIN PCC 7002) (AGMENELLUM QUADRUPLICATUM).
OC PROKARYOTA; GRACILICUTES; OXYPHOTOBACTERIA;
OC CYANOBACTERIA (BLUE-GREEN ALGAE); CHROOCOCCALES.
RN [1]
RP SEQUENCE FROM N.A.
RA WIDGER W.R.;
RL SUBMITTED (XXX-1992) TO EMBL/GENBANK/DDBJ DATA BANKS.
CC -!- FUNCTION: COMPONENT OF THE CYTOCHROME B6/F COMPLEX WHICH IS PART
CC OF THE CHLOROPLASTIC RESPIRATORY CHAIN.
CC -!- THE RIESKE PROTEIN IS A HIGH-POTENTIAL 2FE-2S PROTEIN.
CC -!- CATALYTIC ACTIVITY: PLASTOQUINOL-1 + 2 OXIDIZED PLASTOCYANIN =
CC PLASTOQUINONE + 2 REDUCED PLASTOCYANIN.
CC -!- SUBUNIT: THE MAIN SUBUNITS OF COMPLEX B6-F ARE: CYTOCHROME B6,
CC 17 KD POLYPEPTIDE (PETD), CYTOCHROME F AND THE RIESKE PROTEIN.
CC -!- SIMILARITY: TO RIESKE PROTEINS FROM OTHER SOURCES (MITOCHONDRIA,
CC BACTERIAL, CHLOROPLAST).
DR EMBL; M74514; AQPETAC.
DR PROSITE; PS00199; RIESKE_1.
DR PROSITE; PS00200; RIESKE_2.
KW ELECTRON TRANSPORT; INNER MEMBRANE; TRANSMEMBRANE; IRON-SULFUR;
KW OXIDATIVE PHOSPHORYLATION; RESPIRATORY CHAIN.
FT METAL 108 108 IRON-SULFUR CLUSTER (2FE-2S) (POTENTIAL).
FT METAL 113 113 IRON-SULFUR CLUSTER (2FE-2S) (POTENTIAL).
FT METAL 126 126 IRON-SULFUR CLUSTER (2FE-2S) (POTENTIAL).
FT METAL 129 129 IRON-SULFUR CLUSTER (2FE-2S) (POTENTIAL).
SQ SEQUENCE 180 AA; 19178 MW; 172619 CN;



9 

337. 
0 



Optimized Score 
Matches 



Significance = 4.46 
Mismatches = 18 



Conservative Substitutions 



Initial Score = 
Residue Identity = 
Gaps = 

LYPVIKYFIPPSSGGAGGGVIAKDALGNDI IVSDYLQTHTAGDRSLAQGLKGDPTYVVVEGDNTISSYGINA 

60 70 80 90 100 



40 



50 



X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

II 1 II Mil 

ICTHLGCVVPWNTAENKFMCPCHGSQYDETGKVVRGPAPLSLALVHAEVTEDDKISFTDWTETDFRTDEAPW 

HO 120 130 X 140 150 160 170 

WA 
180 



13. US-08-300-510-2 (1-27)

XYNB_STRLI ENDO-1,4-BETA-XYLANASE B PRECURSOR (EC 3.2.1.8) (X

ID XYNB_STRLI STANDARD; PRT; 333 AA.
AC P26515;
DT 01-AUG-1992 (REL. 23, CREATED)
DT 01-AUG-1992 (REL. 23, LAST SEQUENCE UPDATE)
DT 01-JUN-1994 (REL. 29, LAST ANNOTATION UPDATE)
DE ENDO-1,4-BETA-XYLANASE B PRECURSOR (EC 3.2.1.8) (XYLANASE B)
DE (1,4-BETA-D-XYLAN XYLANOHYDROLASE B).
GN XLNB.
OS STREPTOMYCES LIVIDANS.
OC PROKARYOTA; FIRMICUTES; ACTINOMYCETALES; STREPTOMYCETACEAE.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 41-71.
RC STRAIN=1326;
RM 92077439
RA SHARECK F., ROY C., YAGUCHI M., MOROSOLI R., KLUEPFEL D.;
RL GENE 107:75-82(1991).
CC -!- FUNCTION: CONTRIBUTES TO HYDROLYSE HEMICELLULOSE, THE MAJOR
CC COMPONENT OF PLANT CELL-WALLS. XLNA AND XLNB SEEM TO ACT
CC SEQUENTIALLY ON THE SUBSTRATE TO YIELD XYLOBIOSE AND XYLOSE
CC AS CARBON SOURCES.
CC -!- CATALYTIC ACTIVITY: ENDOHYDROLYSIS OF 1,4-BETA-D-XYLOSIDIC
CC LINKAGES IN XYLANS.
CC -!- PATHWAY: XYLAN DEGRADATION.
CC -!- SUBCELLULAR LOCATION: EXTRACELLULAR.
CC -!- SIMILARITY: BELONGS TO CELLULASE FAMILY G (FAMILY 11 OF GLYCOSYL
CC HYDROLASES).
DR EMBL; M64552; SLXLNB.
DR PROSITE; PS00776; GLYCOSYL_HYDROL_F11_1.
DR PROSITE; PS00777; GLYCOSYL_HYDROL_F11_2.
KW XYLAN DEGRADATION; HYDROLASE; GLYCOSIDASE;
FT SIGNAL 1 40
FT CHAIN 41 333 XYLANASE B.
FT ACT_SITE 127 127 BY SIMILARITY.
FT ACT_SITE 194 194 BY SIMILARITY.
FT ACT_SITE 217 217 BY SIMILARITY.
SQ SEQUENCE 333 AA; 35426 MW; 558782 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.46
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

RTGGTITTGNHFDAWARAGMPLGNFSYYMIMATEGYQSSGTSSINVGGTGGGDSGGGDNGGGGGGCTRRCPP 
190 200 210 220 230 240 250 

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE 

ft I! II I II 

GRSGATGTTSTSPSAAPRLDGDDERAVPGEGPVDLER6RQLSQCADADRQLNGSGNNWGATIQANANWTWPS 
260 270 280 X 290 300 310 320 

VSCSAG 
330 



14. US-08-300-510-2 (1-27) 

YB09_YEAST HYPOTHETICAL 38.7 KD PROTEIN IN RPB5-CDC28 INTERGE 

ID YB09_YEAST STANDARD; PRT; 347 AA.
AC P38286;
DT 01-OCT-1994 (REL. 30, CREATED)
DT 01-OCT-1994 (REL. 30, LAST SEQUENCE UPDATE)
DT 01-OCT-1994 (REL. 30, LAST ANNOTATION UPDATE)
DE HYPOTHETICAL 38.7 KD PROTEIN IN RPB5-CDC28 INTERGENIC REGION.
GN YBR159W OR YBR1209.
OS SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=S288C;
RA ENTIAN K.D., KOETTER P., ROSE H., BECKER J., GREY M., LI Z.,
RA NIEGEMANN E., SCHENK-GROENINGER R., SERVOS J., WEHNER E.,
RA WOLTER R., BRENDEL M., BAUER J., BRAUN H., DERN K., DUESTERHUS S.,
RA GRUENBEIN R., HEDGES D., KIESAU P., KOROL S., KREMS B., PROFT M.,
RA SIEGERS K., BAUR A., BOLES E., MIOSGA T.,
RA SCHAAFF-GERSTENSCHLAEGER I., ZIMMERMANN F.K.;
RL SUBMITTED (AUG-1994) TO EMBL/GENBANK/DDBJ DATA BANKS.
DR EMBL; Z36028; SCYBR159W.
DR PIR; S46030; S46030.
KW HYPOTHETICAL PROTEIN; TRANSMEMBRANE.
FT TRANSMEM 17 37 POTENTIAL.
FT TRANSMEM 39 59 POTENTIAL.
SQ SEQUENCE 347 AA; 38708 MW; 656470 CN;

Initial Score = 9 Optimized Score = 9 Significance = 4.46
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

HTFMQQLQEAGERFRC INGLLWVVFGLGVLKCTTLSLRFLALIFDLFLLPAVNFDKYGAKTGKYCAITGASD 
10 20 30 40 50 60 70 

X 10 20 X 

KALPVVLENAR ILKNCVDAKMTEEDKE 

II il 1 (HI 

GIGKEFARQMAKRGFNLVLISRTQSKLEALQKELEDQHHVVVKILAIDI AEDKESNYESIKELCAQLPITVL 

80 90 100 110 120 X 130 140 

VNNVGQSHSIPVPFLETEEKELRNI ITINNTATLLITQI IAPKI VETVKAENKKSGTRGLILTMGSFGGLIP 
150 160 170 180 190 200 210 

TPLLATYSGS 
220 



15. US-08-300-510-2 (1-27) 

ADH4_YEAST ALCOHOL DEHYDROGENASE IV (EC 1.1.1.1).

ID ADH4_YEAST STANDARD; PRT; 382 AA.
AC P10127;
DT 01-MAR-1989 (REL. 10, CREATED)
DT 01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE)
DT 01-OCT-1993 (REL. 27, LAST ANNOTATION UPDATE)
DE ALCOHOL DEHYDROGENASE IV (EC 1.1.1.1).
GN ADH4.
OS SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
OC EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.
RN [1]
RP SEQUENCE FROM N.A.
RM 88038383
RA WILLIAMSON V.M., PAQUIN C.E.;
RL MOL. GEN. GENET. 209:374-381(1987).
CC -!- FUNCTION: NOT KNOWN YET, AS IN YEAST ADH4 IS NOT EXPRESSED UNDER
CC LABORATORY CONDITIONS EXCEPT UPON INSERTION OF A TY AT THE ADH4
CC LOCUS OR AMPLIFICATION OF ADH4.
CC -!- CATALYTIC ACTIVITY: ALCOHOL + NAD(+) = ALDEHYDE OR KETONE + NADH.
CC -!- SIMILARITY: BELONGS TO THE IRON-CONTAINING ALCOHOL DEHYDROGENASE
CC FAMILY.
DR EMBL; X05992; SCADH4.
DR PIR; S07614; DEBY4.
DR PROSITE; PS00060; ADH_IRON_2.
DR PROSITE; PS00913; ADH_IRON_1.
KW OXIDOREDUCTASE; NAD; IRON.



U btttOtUlt — 35^"wr 4iou nwf ticctb 

Initial Score = 9 Optimized Score = 9 Significance = 4.46
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0

DLINESLVAAYKDGKDKKARTDMCYAEYLAGMAFNNASLGYVHALAHSLGGFYHLPHGVCNAVLLPHVSEAN 
230 240 250 260 270 280 

X 10 20 X 

KALPVVLENARILKNCVDAKMTEEDKE

till 111 II 

MQCPKAKKRLGEI ALHCGASQEDPEET IKALHVLNRTMNIPRNLKDLGVKTEDFDILAEHAMHDACHLTNPV 

300 310 320 X 330 340 350 J6U 

SFTKEQVVAI IKKAYEY 
370 380 

> 0 < 

0| |0 IntelliGenetics 

> 0 < 

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file 2ppat.res made by on Fri 24 Mar 95 7:56:28-PST.

Query sequence being compared: US-08-300-510-2 (1-27)
Number of sequences searched: 50375
Number of scores above cutoff: 4007

Results of the initial comparison of US-08-300-510-2 (1-27) with:
Data bank : A-GeneSeq 17, all entries



[Histogram of initial scores: NUMBER OF SEQUENCES (logarithmic axis, 10 to 100000) plotted against SCORE (0 to 27), with a second axis in STDEV units above the mean.]



PARAMETERS

Similarity matrix        Unitary    K-tuple              2
Mismatch penalty         1          Joining penalty      20
Gap penalty              1.00       Window size          27
Gap size penalty         0.05
Cutoff score             0
Randomization group      0

Initial scores to save   40         Alignments to save   15
Optimized scores to save 0          Display context      100



SEARCH STATISTICS

Scores:   Mean      Median    Standard Deviation
          2         3         1.64

Times:    CPU              Total Elapsed
          00:00:21.01      00:00:22.00



Number of residues: 6065180 
Number of sequences searched: 50375 
Number of scores above cutoff: 4007 

Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4. 
Cut-off raised to 5. 
Cut-off raised to 6. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 
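
The significance column appears to express each initial score as a number of standard deviations above the mean score of the search, which is consistent with the "standard deviations above mean" groupings in the listing. The sketch below assumes that reading. The function name `significance` is hypothetical, and because the printed mean and standard deviation are rounded, the values it produces agree only approximately with the 15.21, 4.26 and 3.65 shown for initial scores of 27, 9 and 8.

```python
def significance(initial_score, mean, stdev):
    """Number of standard deviations by which a score exceeds the search mean.

    Assumed definition, inferred from the listing; not taken from FastDB itself.
    """
    if stdev <= 0:
        raise ValueError("standard deviation must be positive")
    return (initial_score - mean) / stdev


# Rounded values as printed under SEARCH STATISTICS.
MEAN, STDEV = 2.0, 1.64

for score in (27, 9, 8):
    print(score, round(significance(score, MEAN, STDEV), 2))
```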

2 100% identical sequences to the query sequence were found:

                                                      Init.  Opt.
Sequence Name  Description                    Length  Score  Score  Sig.   Frame

 1. R41976     Human T cell reactive feline       27     27     27  15.21      0
 2. R36543     Peptide Y.                         27     27     27  15.21      0



9 100% similar sequences to the query sequence were found:

                                                      Init.  Opt.
Sequence Name  Description                    Length  Score  Score  Sig.   Frame

 3. R12120     TRFP chain 1 with leader B.        96     27     27  15.21      0
 4. R27368     TRFP Chain #1 with CI leader       96     27     27  15.21      0
 5. R36548     Recombitope YZX.                   96     27     27  15.21      0
 6. R27367     TRFP Chain #1 with CI leader       94     27     27  15.21      0
 7. R12119     TRFP chain 1 with leader A.        94     27     27  15.21      0
 8. R36539     TRFP chain 1 (with Leader A).      92     27     27  15.21      0
 9. R41983     Human T cell reactive feline       92     27     27  15.21      0
10. R36540     TRFP chain 1 (with Leader B).      88     27     27  15.21      0
11. R41984     Human T cell reactive feline       88     27     27  15.21      0



The list of other best scores is:

                                                      Init.  Opt.
Sequence Name  Description                    Length  Score  Score  Sig.   Frame

**** 4 standard deviations above mean ****
12. R13139     B. burgdorferi strain PKo p100    663      9      9   4.26      0
13. R44489     Sequence of all or part of a     1429      9      9   4.26      0

**** 3 standard deviations above mean ****
14. R15140     Vascular injury affinity pept      18      8      8   3.65      0
15. R15127     Vascular injury affinity pept      18      8      8   3.65      0
16. R38912     Recombitope XZY.                   35      8     12   3.65      0
17. P30230     Sequence of interferon IFN-al     189      8      9   3.65      0
18. R07678     IFN-alpha 61.                     189      8      9   3.65      0
19. R47338     Peptide fragment of tetracycl     194      8      9   3.65      0
20. R55342     Sequence of rabbit alpha-1 ac     201      8      9   3.65      0
21. R37299     Plant type 1 RIP Saporin 6.       259      8      9   3.65      0
22. R43955     Saporin from clone M13 mp18-G     268      8      9   3.65      0
23. R43954     Saporin from clone M13 mp18-G     268      8      9   3.65      0
24. R43953     Saporin from clone M13 mp18-G     268      8      9   3.65      0
25. R43952     Saporin from clone M13 mp18-G     268      8      9   3.65      0
26. R43951     Saporin from clone M13 mp18-G     268      8      9   3.65      0
27. R26995     Human IGFBP-5.                    272      8     10   3.65      0
28. P90957     Ribosomal inactivating protei     280      8      9   3.65      0
29. R51233     Heat resistant alkali proteas     361      8      9   3.65      0
30. R49248     Actin.                            375      8      9   3.65      0
31. R22096     Actin.                            375      8      9   3.65      0
32. R22026     A. chrysogenum actin.             375      8      9   3.65      0
33. R25276     SCC antigen.                      390      8     10   3.65      0
34. R43958     Saporin/FGF fusion protein.       410      8      9   3.65      0
35. R43957     Saporin/FGF fusion protein.       410      8      9   3.65      0
36. P60230     Dihydroxyacetone-synthetase.      702      8      9   3.65      0
37. R11607     Recombinant dihydroxyacetone      702      8      9   3.65      0
38. R49507     Human LIF-R clone 65.             719      8      9   3.65      0
39. R25069     nLIF-R.                           719      8      9   3.65      0
40. R52027     Protein with Oxetanocin-A pro     744      8      8   3.65      0




1. US-08-300-510-2 (1-27)

R41976 Human T cell reactive feline protein fragment Y.

ID R41976 standard; peptide; 27 AA.
AC R41976;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein fragment Y.
KW Human; T cell; reactive; feline; protein; immune response;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen; ss.
OS Homo sapiens.
PN WO9319178-A.
PD 30-SEP-1993.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006116.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL,
PI Kuo M, Morville M;
DR WPI; 93-320744/40.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Claim 1; Fig 3; 107pp; English.
CC The sequences given in R41975-82 are peptides derived from a human T
CC cell reactive feline protein. These peptides are used in a
CC therapeutic composition which is useful in treating diseases which
CC involve an immune response to a protein antigen. This composition
CC may be used to induce tolerance in a mammal to Dermatophagoides,
CC Felis, Ambrosia, Lolium, Cryptomeria, Alternaria, Alder, Betula,
CC Quercus, Olea, Artemesia, Plantago, Parietaria, Canis, Blattella,
CC Apis, Periplaneta and to autoantigens in humans.
SQ Sequence 27 AA;
SQ 3 A; 1 R; 2 N; 2 D; 0 B; 1 C; 0 Q; 4 E; 0 Z; 0 G; 0 H;
SQ 1 I; 3 L; 4 K; 1 M; 0 F; 1 P; 0 S; 1 T; 0 W; 0 Y; 3 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X        10        20     X
KALPVVLENARILKNCVDAKMTEEDKE
|||||||||||||||||||||||||||
KALPVVLENARILKNCVDAKMTEEDKE
X        10        20     X
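
The residue counts in the SQ composition lines of this entry can be checked against the 27-residue query sequence. The following is a minimal sketch, assuming the one-letter column order used by the composition lines in this listing (A R N D B C Q E Z G H, then I L K M F P S T W Y V); the helper name `composition` is hypothetical.

```python
from collections import Counter

# Column order assumed from the SQ composition lines of the listing.
ORDER = "ARNDBCQEZGHILKMFPSTWYV"
QUERY = "KALPVVLENARILKNCVDAKMTEEDKE"


def composition(sequence):
    """Return (count, residue) pairs in the listing's column order."""
    counts = Counter(sequence.upper())
    return [(counts[residue], residue) for residue in ORDER]


pairs = composition(QUERY)
# Print the counts in two rows of eleven, as in the SQ lines.
print(" ".join(f"{n} {aa};" for n, aa in pairs[:11]))
print(" ".join(f"{n} {aa};" for n, aa in pairs[11:]))
```

The counts sum to the stated sequence length, which gives a quick consistency check when a composition line has been damaged in scanning.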



2. US-08-300-510-2 (1-27)

R36543 Peptide Y.

ID R36543 standard; Protein; 27 AA.
AC R36543;
DT 12-AUG-1993 (first entry)
DE Peptide Y.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope.
OS Felis.
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 4; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP. DNA
CC constructs were assembled in which 3 regions (encoding peptides X,
CC Y and Z) were linked to produce DNA constructs encoding recombitope-
CC peptides.
SQ Sequence 27 AA;
SQ 3 A; 1 R; 2 N; 2 D; 0 B; 1 C; 0 Q; 4 E; 0 Z; 0 G; 0 H;
SQ 1 I; 3 L; 4 K; 1 M; 0 F; 1 P; 0 S; 1 T; 0 W; 0 Y; 3 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X        10        20     X
KALPVVLENARILKNCVDAKMTEEDKE
|||||||||||||||||||||||||||
KALPVVLENARILKNCVDAKMTEEDKE
X        10        20     X



3. US-08-300-510-2 (1-27)

R12120 TRFP chain 1 with leader B.

ID R12120 standard; Protein; 96 AA.
AC R12120;
DT 26-JUL-1991 (first entry)
DE TRFP chain 1 with leader B.
KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 9..26
FT /label= Leader B
FT Protein 27..96
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U06548.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM COR.
PI Gefter ML, Garman RD, Greenstein JL, Juo M, Rogers BL,
PI Brauer AW;
DR WPI; 91-164136/22.
DR N-PSDB; Q11837.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.
CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genomic library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptide derivs.
CC which stimulate T-cells in persons allergic to cats. The peptides
CC can be used to reduce/eliminate the allergic response partic. by
CC modificn. of lymphokine prodn. by the T-cells. They can also be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species, and also
CC for prodn. of modified forms of TRFP esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12119-R12123.
SQ Sequence 96 AA;
SQ 12 A; 4 R; 3 N; 8 D; 0 B; 6 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 7 K; 2 M; 1 F; 7 P; 3 S; 6 T; 2 W; 3 Y; 8 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

                                                      X       10
                                                      KALPVVLENARILKNCVD
                                                      ||||||||||||||||||
AWRCSWKRMLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVD
        10        20        30        40        50    X   60        70

20      X
AKMTEEDKE
|||||||||
AKMTEEDKENALSLLDKIYTSPLC
      80        90



4. US-08-300-510-2 (1-27)

R27368 TRFP Chain #1 with CI leader B sequence.

ID R27368 standard; protein; 96 AA.
AC R27368;
DT 25-FEB-1993 (first entry)
DE TRFP Chain #1 with CI leader B sequence.
KW T cell reactive feline protein; cat allergy; allergic; IgE;
KW desensitizing;
OS Felis domesticus.
FH Key Location/Qualifiers
FT Peptide 1..27
FT /label= Leader B
FT Protein 28..96
FT /label= TRFP chain #1
PN WO9215613-A.
PD 17-SEP-1992.
PF 20-FEB-1992; U01344.
PR 28-FEB-1991; US-662193.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond J, Kuo M;
DR WPI; 92-331670/40.
PT Modified human T-cell reactive feline protein - stimulates T-cell
PT in individuals allergic to cats and shows reduced
PT histamine-releasing properties
PS Claim 1; Fig 1; 35pp; English.
CC This sequence represents a modified human T-cell reactive feline
CC protein which stimulates T-cells from an individual who is allergic
CC to cats, but which interacts with human IgE to a lesser extent than
CC does affinity purified TRFP. The protein is modified by treating
CC with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC amines) or an enzyme which removes O-linked groups (carbohydrate
CC moieties). It is useful in desensitising people who are allergic to cats.
SQ Sequence 96 AA;
SQ 12 A; 4 R; 3 N; 8 D; 1 B; 6 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 7 K; 2 M; 1 F; 7 P; 3 S; 6 T; 2 W; 3 Y; 7 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

                                                      X       10
                                                      KALPVVLENARILKNCVD
                                                      ||||||||||||||||||
AWRCSWKRMLDAALPPCPTBAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVD
        10        20        30        40        50    X   60        70

20      X
AKMTEEDKE
|||||||||
AKMTEEDKENALSLLDKIYTSPLC
      80        90



5. US-08-300-510-2 (1-27) 

R36548 Recombitope YZX. 

ID R36548 standard; Protein; 96 AA.
AC R36548;
DT 12-AUG-1993 (first entry)
DE Recombitope YZX.
KW Human T cell reactive feline protein; TRFP; epitope; recombitope;
KW sensitivity; Felis domesticus.
OS Synthetic.
FH Key Location/Qualifiers
FT Cleavage-site 14..15
FT /label= thrombin_cleavage_site
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41572.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto-antigens and protein antigens
PS Disclosure; Fig 8; 73pp; English.
CC Preferred recombitope peptides for treating sensitivity to Felis
CC domesticus are derived from the genus Felis and comprise
CC regions selected from peptides X, Y, Z, A and B, of TRFP, and
CC modifications thereof, such as peptide C.
CC Oligonucleotides C, D, E, F, G, H and I are used in the
CC construction of recombitope peptide YZX.
SQ Sequence 96 AA;
SQ 8 A; 4 R; 5 N; 6 D; 0 B; 1 C; 2 Q; 10 E; 0 Z; 4 G; 6 H;
SQ 1 I; 12 L; 7 K; 2 M; 4 F; 5 P; 2 S; 5 T; 0 W; 2 Y; 10 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

X 10 20 X 

kalpvvlenarilkncvdakmteedke 
hghhhhhheflvprgskalW 

10 X 20 30 40 X 50 60 70 

VDLFLTGTPDEYVEQVAQYKALPV 
80 90 



6. US-08-300-510-2 (1-27) 

R27367 TRFP Chain #1 with CI leader A sequence. 

ID R27367 standard; protein; 94 AA.
AC R27367;
DT 25-FEB-1993 (first entry)
DE TRFP Chain #1 with CI leader A sequence.
KW T cell reactive feline protein.
OS Felis domesticus.
FH Key Location/Qualifiers
FT Peptide 1..25
FT /label= Leader A
FT Protein 25..94
FT /label= TRFP chain #1
PN WO9215613-A.
PD 17-SEP-1992.
PF 20-FEB-1992; U01344.
PR 28-FEB-1991; US-662193.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI
DR WPI; 92-331670/40.
PT Modified human T-cell reactive feline protein - stimulates T-cell
PT in individuals allergic to cats and shows reduced
PT histamine-releasing properties
PS Claim 1; Fig 1; 35pp; English.
CC This sequence represents a modified human T-cell reactive feline
CC protein which stimulates T-cells from an individual who is allergic
CC to cats, but which interacts with human IgE to a lesser extent than
CC does affinity purified TRFP. The protein is modified by treating
CC with either a mild alkali (pH 12.5-13.5, KOH, NaOH, LiOH or tertiary
CC amines) or an enzyme which removes O-linked groups (carbohydrate
CC moieties). It is useful in desensitising people who are allergic to cats
SQ Sequence 94 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 5 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

                                                    X       10        20
                                                    KALPVVLENARILKNCVDAK
                                                    ||||||||||||||||||||
CIMKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAK
        10        20        30        40        50  X     60        70

      X
MTEEDKE
|||||||
MTEEDKENALSLLDKIYTSPLC
      80        90



7. US-08-300-510-2 (1-27)

R12119 TRFP chain 1 with leader A.

ID R12119 standard; Protein; 94 AA.
AC R12119;
DT 26-JUL-1991 (first entry)
DE TRFP chain 1 with leader A.
KW Human T cell reactive feline protein; cat allergens.
OS Felis catus.
FH Key Location/Qualifiers
FT Peptide 3..24
FT /label= Leader B
FT Protein 25..94
FT /label= TRFP Chain 1
PN WO9106571-A.
PD 16-MAY-1991.
PF 02-NOV-1990; U06548.
PR 03-NOV-1989; US-431565.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Gefter ML, Garman RD, Greenstein JL, Kuo M, Rogers BL,
PI Brauer AW;
DR WPI; 91-164136/22.
DR N-PSDB; Q11836.
PT New pure covalently linked human T cell reactive feline protein -
PT and modified peptide(s), used to reduce effects of cat allergens
PT and to diagnose sensitivity to allergens.
PS Claim 2; Fig 1; 70pp; English.

CC Poly-A mRNA from cat parotid and mandibular glands was used to
CC produce cDNA clones for both chain 1 and chain 2 of TRFP. These
CC clones were then used to screen a cat genomic library. Chain 1
CC exists in two forms having different leader sequences (A and B).
CC The sequence can be used to express the protein and peptide derivs.
CC to stimulate T-cells in persons allergic to cats. The peptides can
CC be used to reduce/eliminate the allergic response, partic. by
CC modificn. of lymphokine prodn. by the T-cells. They can be
CC used to identify epitopes responsible for sensitivity. The DNA can
CC be used to detect comparable sequence in other species, and also
CC for prodn. of modified forms of TRFP, esp. showing reduced binding
CC to IgE and thus reduced tendency to cause adverse reactions.
CC See also R12120-R12123.
SQ Sequence 94 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 5 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;



Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

                                                    X       10        20
                                                    KALPVVLENARILKNCVDAK
                                                    ||||||||||||||||||||
CIMKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAK
        10        20        30        40        50  X     60        70

MTEEDKE
|||||||
MTEEDKENALSLLDKIYTSPLC
      80        90



8. US-08-300-510-2 (1-27) 

R36539 TRFP chain 1 (with Leader A) 



ID R36539 standard; Protein; 92 AA.
AC R36539;
DT 12-AUG-1993 (first entry)
DE TRFP chain 1 (with Leader A).
KW Human T cell reactive feline protein; TRFP; leader A; leader B;
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..22
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto; antigens and protein antigens
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 92 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 4 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 4 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;



Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



                                                  X       10        20
                                                  KALPVVLENARILKNCVDAKMT
                                                  ||||||||||||||||||||||
MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMT
        10        20        30        40        50X       60        70

    X
EEDKE
|||||
EEDKENALSLLDKIYTSPLC
    X 80        90



9. US-08-300-510-2 (1-27)

R41983 Human T cell reactive feline protein A chain 1



ID R41983 standard; Protein; 92 AA.
AC R41983;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein A chain 1.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS Homo sapiens.
FH Key Location/Qualifiers
FT Peptide 1..22
FT /note= "Signal peptide"
FT Protein 23..92
FT /note= "Mature protein"
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006116.
PA (IMMU-) IMMUNOLOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M,
PI Morville M;
DR WPI; 93-320744/40.
DR N-PSDB; Q49533.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Disclosure; Fig 1; 107pp; English.
CC The sequences given in R41983-84 represent chain 1 of human T cell
CC reactive feline proteins (TRFP) A and B respectively. Peptides
CC derived from TRFP may be used in a therapeutic composition which is
CC useful in treating diseases which involve an immune response to a
CC protein antigen. This composition may be used to induce tolerance
CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC in humans.
SQ Sequence 92 AA;
SQ 9 A; 3 R; 4 N; 6 D; 0 B; 4 C; 2 Q; 7 E; 0 Z; 4 G; 0 H;
SQ 4 I; 15 L; 7 K; 2 M; 1 F; 4 P; 2 S; 4 T; 2 W; 3 Y; 9 V;



Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



                                                  X       10        20
                                                  KALPVVLENARILKNCVDAKMT
                                                  ||||||||||||||||||||||
MKGARVLVLLWAALLLIWGGNCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMT
        10        20        30        40        50X       60        70

    X
EEDKE
|||||
EEDKENALSLLDKIYTSPLC
    X 80        90



10. US-08-300-510-2 (1-27) 

R36540 TRFP chain 1 (with Leader B).

ID R36540 standard; Protein; 88 AA.
AC R36540;
DT 12-AUG-1993 (first entry)
DE TRFP chain 1 (with Leader B).
KW Human T cell reactive feline protein; TRFP; leader A; leader B;
KW epitope.
OS Felis.
FH Key Location/Qualifiers
FT Peptide 1..18
FT /label= leader_peptide
PN WO9308280-A.
PD 29-APR-1993.
PF 16-OCT-1992; U08694.
PR 16-OCT-1991; US-777859.
PR 13-DEC-1991; US-807529.
PA (IMMU-) IMMULOGIC PHARM CORP.
PI Bond JF, Garman RD, Kuo M, Morgenstern JP, Morville M,
PI Rogers BL;
DR WPI; 93-152473/18.
DR N-PSDB; Q41557.
PT Recombitope peptide having T-cell stimulating activity - for the
PT diagnosis and treatment of sensitivity to protein allergens,
PT auto; antigens and protein antigens
PS Disclosure; Fig 1; 73pp; English.
CC Chains 1 and 2 of the TRFP have been recombinantly expressed in E.
CC coli and purified. T cell epitope studies using overlapping peptide
CC regions derived from the TRFP amino acids sequence were used to
CC identify multiple T cell epitopes in each chain of TRFP.
SQ Sequence 88 AA;
SQ 11 A; 2 R; 3 N; 8 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 6 K; 2 M; 1 F; 7 P; 2 S; 6 T; 0 W; 3 Y; 8 V;



Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0



                                              X       10        20
                                              KALPVVLENARILKNCVDAKMTEEDK
                                              ||||||||||||||||||||||||||
MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMTEEDK
        10        20        30        40      X 50        60        70

X
E
|
ENALSLLDKIYTSPLC
X     80



11. US-08-300-510-2 (1-27) 

R41984 Human T cell reactive feline protein B chain 1.

AC R41984;
DT 21-APR-1994 (first entry)
DE Human T cell reactive feline protein B chain 1.
KW Human; T cell; reactive; feline; protein; immune response; antigen;
KW tolerance; mammal; Dermatophagoides; Felis; Ambrosia; Lolium; Canis;
KW Cryptomeria; Alternaria; Alder; Betula; Quercus; Olea; Artemesia;
KW Plantago; Parietaria; Blattella; Apis; Periplaneta; autoantigen.
OS Homo sapiens.
FH Key Location/Qualifiers
FT Peptide 1..17
FT /note= "Signal peptide"
FT Protein 18..88
FT /note= "Mature protein"
PN WO9319178-A.
PD 30-SEP-1993.
PF 25-MAR-1993; U02462.
PR 25-MAR-1992; US-857311.
PR 15-MAY-1992; US-884718.
PR 15-JAN-1993; US-006116.
PA (IMMU-) IMMUNOLOGIC PHARM CORP.
PI Briner TJ, Garman RD, Gefter ML, Greenstein JL, Kuo M,
PI Morville M;
DR WPI; 93-320744/40.
DR N-PSDB; Q49534.
PT New peptide(s) for inducing tolerance - comprise one or more
PT epitope(s) of an allergen administered subcutaneously, for
PT treating sensitivity to cats, bees, etc.
PS Disclosure; Fig 1; 107pp; English.
CC The sequences given in R41983-84 represent chain 1 of human T cell
CC reactive feline proteins (TRFP) A and B respectively. Peptides
CC derived from TRFP may be used in a therapeutic composition which is
CC useful in treating diseases which involve an immune response to a
CC protein antigen. This composition may be used to induce tolerance
CC in a mammal to Dermatophagoides, Felis, Ambrosia, Lolium, Cryptomeria,
CC Alternaria, Alder, Betula, Quercus, Olea, Artemesia, Plantago,
CC Parietaria, Canis, Blattella, Apis, Periplaneta and to autoantigens
CC in humans.
SQ Sequence 88 AA;
SQ 11 A; 2 R; 3 N; 8 D; 0 B; 5 C; 2 Q; 7 E; 0 Z; 1 G; 0 H;
SQ 3 I; 11 L; 6 K; 2 M; 1 F; 7 P; 2 S; 6 T; 0 W; 3 Y; 8 V;

Initial Score = 27 Optimized Score = 27 Significance = 15.21
Residue Identity = 100% Matches = 27 Mismatches = 0
Gaps = 0 Conservative Substitutions = 0

                                              X       10        20
                                              KALPVVLENARILKNCVDAKMTEEDK
                                              ||||||||||||||||||||||||||
MLDAALPPCPTVAATADCEICPAVKRDVDLFLTGTPDEYVEQVAQYKALPVVLENARILKNCVDAKMTEEDK
        10        20        30        40      X 50        60        70

X
E
|
ENALSLLDKIYTSPLC
X     80



12. US-08-300-510-2 (1-27) 

R13139 B. burgdorferi strain PKo p100 gene.

ID R13139 standard; Protein; 663 AA.
AC R13139;
DT 27-SEP-1991 (first entry)
DE B. burgdorferi strain PKo p100 gene.
KW Lyme borreliosis; vaccine; flagellin; ss.
OS Borrelia burgdorferi.
PN WO9109870-A.
PD 11-JUL-1991.
PF 21-DEC-1990; E02282.
PR 22-DEC-1989; DE-942728.
PR 13-JUN-1990; DE-018988.
PA (MIKR-) MIKROGEN MOLEKULARB.
PI Fuchs R, Wilske B, Preac-Mursic V, Motz H, Soutschek E;
DR WPI; 91-222844/30.
PT New Borrelia burgdorferi proteins - useful as immunoassay
PT reagents and antigens for vaccine prodn.
PS Claim 11; Page 49; 68pp; German.
CC Protein p100 was isolated from a B.burgdorferi cell lysate and the N-
CC terminal amino acid sequence was determined. A probe pool was
CC synthesised and used to screen a B.burgdorferi cDNA library. A clone
CC contg. the 5' 346 nucleotides of the p100 coding sequence was
CC identified and sequenced. Cloning the entire gene allowed the p100
CC amino acid sequence to be deduced.
CC See also Q12744-Q12747, Q13297-8 and R13140-R13142.
SQ Sequence 663 AA;
SQ 28 A; 19 R; 46 N; 64 D; 0 B; 0 C; 29 Q; 65 E; 0 Z; 21 G; 4 H;
SQ 52 I; 70 L; 79 K; 4 M; 25 F; 16 P; 59 S; 22 T; 2 W; 19 Y; 39 V;



Initial Score = 9 Optimized Score = 9 Significance = 4.26
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0
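
The Residue Identity figures in these summaries are consistent with matches divided by aligned positions (matches plus mismatches, the alignments here being gap-free). The following is a minimal Python sketch under that assumption; the function name `residue_identity` is hypothetical and is not part of FASTDB or MPSRCH.

```python
def residue_identity(matches: int, mismatches: int) -> int:
    """Percent identity over the aligned positions, rounded to a whole percent.

    Assumption: identity = matches / (matches + mismatches), which fits the
    gap-free alignments reported in this listing.
    """
    aligned = matches + mismatches
    if aligned == 0:
        raise ValueError("alignment has no aligned positions")
    return round(100 * matches / aligned)


# Match and mismatch counts copied from the result summaries in this report.
print(residue_identity(27, 0))   # full-length match of the 27-residue query
print(residue_identity(9, 18))   # results 12 and 13
print(residue_identity(8, 10))   # results 14 and 15
```

Under this reading, the low-scoring hits (results 12 to 15) are short stretches of coincidental identity rather than full-length matches.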

* DLDKASQKLDFAEDNLDIQRDTVREKLQENINETNKEKNLPKPGDVSSPKVDKQLQIKESLEDLQEQLKEA 
300 310 320 330 340 350 360 370 

                            X       10        20      X
                            KALPVVLENARILKNCVDAKMTEEDKE
                            |||    |              |||||
SDEN8KREIEKQIEIKKNDEELFKNKDHKALDLKQELNSKASSKEKIEGEEEDKELDSKKNLEPVSEADKVD
      380       390       400       410       420     X 430       440

KISKSNNNEVSKLSPLDEPSYSDIDSKEGVDNKDVDLQKTKPQVESQPTSLNEDLIDVSIDSSNPVFLEVID 
450 460 470 480 490 500 510 

PITNLGTLQLI 
520 



13. US-08-300-510-2 (1-27) 

R44489 Sequence of all or part of a mammalian calmodulin-

ID R44489 standard; Protein; 1429 AA.
AC R44489;
DT 19-JUN-1994 (first entry)
DE Sequence of all or part of a mammalian calmodulin-dependent
DE nitric oxide synthase (NOS).
KW Calmodulin-dependent nitric oxide synthase; NOS;
KW immunohistochemical reagent; antibody; assay.
OS Homo sapiens.
PN US5268465-A.
PD 07-DEC-1993.
PF 18-JAN-1991; 642002.
PR 18-JAN-1991; US-642002.
PA (UYJO ) UNIV JOHNS HOPKINS.
PI Bredt DS, Snyder SH;
DR WPI; 93-404061/50.
DR N-PSDB; Q53403.
PT DNA encoding mammalian, calmodulin-dependent nitric oxide
PT synthase - used to raise antibodies which localise NOS in the
PT body, useful as immuno;histochemical reagent
PS Disclosure; columns ctipp; English.
CC Degenerate oligonucleotide (OG) primers of 21 nucleotides were
CC constructed, based on the seven AAs at the carboxyl and amino
CC termini of each of the two longest trypsin peptides of NOS enzyme
CC (18 and 17 AAs). These OGs were used in a PCR reaction to construct
CC two non-degenerate OG primers. The two non-degenerate primers were
CC used in a further PCR reaction to obtain a larger polynucleotide
CC probe. A 600 bp amplified prod. was obtd. and random prime-labelled
CC with (32)P-ATP to screen a commercially obtd. rat brain cDNA
CC library. Eight overlapping independent clones were isolated and
CC sequenced by ds dideoxy sequencing. A 4 kb ORF encoding a 150 kD
CC protein (corresp. to the mol. wt. on NOS) was revealed. A flavin-
CC binding consensus sequence was observed in the AA sequence.
SQ Sequence 1429 AA;
SQ 87 A; 80 R; 61 N; 82 D; 0 B; 24 C; 68 Q; 95 E; 0 Z; 101 G; 41 H;
SQ 68 I; 124 L; 90 K; 29 M; 59 F; 81 P; 100 S; 78 T; 21 W; 41 Y; 99 V;

Initial Score = 9 Optimized Score = 9 Significance = 4.26
Residue Identity = 33% Matches = 9 Mismatches = 18
Gaps = 0 Conservative Substitutions = 0



GDDVNIEKPNNSLISNDRSWKRNKFRLTYVAEAPDLT9GLSNVHKKRVSAARLLSRQNLQSPKFSRSTIFVR 
950 960 970 980 990 1000 1010 

                            X       10        20      X
                            KALPVVLENARILKNCVDAKMTEEDKE
                             ||   || |      |   | ||
LHTNGNQELSYQPGDHLGVFPGNHEDLVNALIERLEDAPPANHVVKVEMLEERNTALGVISNWKDESRLPPC
1020      1030      1040    X 1050      1060      1070      1080

TIFQAFKYYLDITTPPTPL6L9QFASLATNEKEKQRLLVLSKGLQEYEEWKMGKNPTMVEVLEEFPSIQMPA 
1090 1100 1110 1120 1130 1140 1150 1160

TLLLTQLSLL3 
1170 



14. US-08-300-510-2 (1-27) 

R15140 Vascular injury affinity peptide. 

ID R15140 standard; Protein; 18 AA.
AC R15140;
DT 18-FEB-1992 (first entry)
DE Vascular injury affinity peptide.
KW Low density lipoprotein; atherosclerosis.
OS Synthetic.
PN WO9116919-A.
PD 14-NOV-1991.
PF 02-MAY-1991; U03026.
PR 03-MAY-1990; US-518215.
PR 03-MAY-1990; US-518142.
PA (NEWE-) NEW ENGLAND DEACON.
PI Lees RS, Lees AM, Fischman A, Shih IL, Findeis MA;
DR WPI; 91-353525/48.
PT Synthetic peptide(s) comprising amphiphilic domain of apoA-I -
PT used to diagnose vascular injury or disease or inhibit binding of
PT low density lipoprotein to vascular walls in treating
PT atherosclerosis
PS Disclosure; Page 8; 66pp; English.
CC The amino acid sequence is that of a synthetic peptide (opt. labelled)
CC which is used to detect injuries in the vascular system, esp. athero-
CC sclerosis in its early stages before it causes stenosis and blood
CC flow disturbances. It can also be used to inhibit binding of low
CC density lipoprotein (LDL) to vascular walls, i.e. to prevent or
CC alleviate atherosclerosis. It is easy to prepare on a large scale
CC and allows vascular regions to be located non-invasively without
CC complex equipment or highly skilled personnel. See also R15126-R15140.
SQ Sequence 18 AA;
SQ 6 A; 1 R; 1 N; 0 D; 0 B; 0 C; 0 Q; 2 E; 0 Z; 1 G; 0 H;
SQ 0 I; 4 L; 2 K; 0 M; 0 F; 0 P; 0 S; 0 T; 0 W; 1 Y; 0 V;



Initial Score = 8 Optimized Score = 8 Significance = 3.65
Residue Identity = 44% Matches = 8 Mismatches = 10
Gaps = 0 Conservative Substitutions = 0



X       10        20
KALPVVLENARILKNCVDAKMTEEDKE
      || || | |   ||
  YKLALEAARLLANAEGAK
  X       10       X



15. US-08-300-510-2 (1-27) 

R15127 Vascular injury affinity peptide. 



ID R15127 standard; Protein; 18 AA.
AC R15127;
DT 18-FEB-1992 (first entry)
DE Vascular injury affinity peptide.
KW Low density lipoprotein; atherosclerosis.
OS Synthetic.
PN WO9116919-A.
PD 14-NOV-1991.
PF 02-MAY-1991; U03026.
PR 03-MAY-1990; US-518215.
PR 03-MAY-1990; US-518142.
PA (NEWE-) NEW ENGLAND DEACON.
PI Lees RS, Lees AM, Fischman A, Shih IL, Findeis MA;
DR WPI; 91-353525/48.
PT Synthetic peptide(s) comprising amphiphilic domain of apoA-I -
PT used to diagnose vascular injury or disease or inhibit binding of
PT low density lipoprotein to vascular walls in treating
PT atherosclerosis
PS Claim 1; Page 44; 66pp; English.
CC The amino acid sequence is that of a synthetic peptide (opt. labelled)
CC which is used to detect injuries in the vascular system, esp. athero-
CC sclerosis in its early stages before it causes stenosis and blood
CC flow disturbances. It can also be used to inhibit binding of low
CC density lipoprotein (LDL) to vascular walls, i.e. to prevent or
CC alleviate atherosclerosis. It is easy to prepare on a large scale
CC and allows vascular regions to be located non-invasively without
CC complex equipment or highly skilled personnel. See also R15126-R15140.
SQ Sequence 18 AA;
SQ 6 A; 1 R; 1 N; 0 D; 0 B; 0 C; 0 Q; 2 E; 0 Z; 1 G; 0 H;
SQ 0 I; 4 L; 2 K; 0 M; 0 F; 0 P; 0 S; 0 T; 0 W; 1 Y; 0 V;
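
The SQ composition lines can be cross-checked against the sequence itself. The sketch below is illustrative only: it assumes the 18-residue peptide shown in the alignments for results 14 and 15, and the helper name `composition` is hypothetical rather than part of any search tool.

```python
from collections import Counter

# Residue codes in the order they are printed on the SQ composition lines.
SQ_ORDER = "ARNDBCQEZGHILKMFPSTWYV"


def composition(sequence: str) -> dict:
    """Count each residue code, reporting zero for codes that are absent."""
    counts = Counter(sequence.upper())
    return {code: counts.get(code, 0) for code in SQ_ORDER}


peptide = "YKLALEAARLLANAEGAK"  # subject sequence aligned in results 14 and 15
counts = composition(peptide)

# Print in the same "count code;" layout used by the SQ lines.
print(" ".join(f"{n} {code};" for code, n in counts.items()))
```

A composition whose counts do not sum to the stated sequence length is a quick sign that a digit was misread in the listing.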



Initial Score = 8 Optimised Score = 8 Significance = 3.65 

Residue Identity = 44% Matches = 8 Mismatches = 10 

Gaps = 0 Conservative Substitutions = 0 



X       10        20
KALPVVLENARILKNCVDAKMTEEDKE
      || || | |   ||
  YKLALEAARLLANAEGAK
  X       10       X
L2 ANSWER 1 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 152416-30-3 REGISTRY
CN Allergen Fel d I (Felis catus) (9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF Unspecified
CI MAN
SR CA

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
1 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: 120:75353

L2 ANSWER 2 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 149119-99-3 REGISTRY
CN 47-73-Glycoprotein TRFP (Felis catus chain 1 isoform B protein
moiety reduced), N-(L-methionylglycyl-L-histidyl-L-histidyl-L-
histidyl-L-histidyl-L-histidyl-L-histidyl-L-.alpha.-glutamyl-L-
phenylalanyl-L-leucyl-L-valyl-L-prolyl-L-arginylglycyl-L-seryl)-
(73.fwdarw.14')-protein with 14-39-allergen Fel d I (Felis catus
chain 2 protein moiety reduced) (39'.fwdarw.25'')-protein with
25-51-glycoprotein TRFP (Felis catus chain 1 isoform B protein
moiety reduced) (9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF Unspecified
CI MAN
SR CA

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
1 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: P 119:93527

L2 ANSWER 3 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 149119-95-9 REGISTRY
CN 69-87-Glycoprotein TRFP (Felis catus chain 1 isoform B protein
moiety reduced), (87.fwdarw.47')-protein with 47-73-glycoprotein
TRFP (Felis catus chain 1 isoform B protein moiety reduced)
(73'.fwdarw.14'')-protein with 14-39-allergen Fel d I (Felis catus
chain 2 protein moiety reduced) (39''.fwdarw.25''')-protein with
25-51-glycoprotein TRFP (Felis catus chain 1 isoform B protein
moiety reduced) (51'''.fwdarw.74'''')-protein with 74-92-allergen
Fel d I (Felis catus chain 2 protein moiety reduced) (9CI) (CA INDEX
NAME)
FS PROTEIN SEQUENCE
MF Unspecified
CI MAN
SR CA

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
1 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: P 119:93527

L2 ANSWER 4 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 149119-94-8 REGISTRY
CN 47-73-Glycoprotein TRFP (Felis catus chain 1 isoform B protein
moiety reduced), (73.fwdarw.14')-protein with 14-39-allergen Fel d I
(Felis catus chain 2 protein moiety reduced) (39'.fwdarw.25'')-
protein with 25-51-glycoprotein TRFP (Felis catus chain 1 isoform B
protein moiety reduced) (9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF Unspecified
CI MAN
SR CA

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
1 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: P 119:93527

L2 ANSWER 5 OF 13 REGISTRY COPYRIGHT 1995 ACS
CN Glycoprotein TRFP (Felis catus chain 1 isoform B protein moiety
reduced) (9CI) (CA INDEX NAME)
OTHER NAMES:
CN Allergen Fel d I (Felis catus chain 1 isoform B precursor protein
moiety reduced)
CN Leader B-human T cell-reactive feline protein chain 1 (cat)
FS PROTEIN SEQUENCE
MF C422 H685 N105 O135 S7
CI MAN
SR CA
LC STN Files: CA
DES 5:ALL,L

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
4 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: P 120:75451
REFERENCE 2: P 119:93527
REFERENCE 3: 118:122542
REFERENCE 4: P 115:205920


L2 ANSWER 6 OF 13 REGISTRY COPYRIGHT 1995 ACS
CN Glycoprotein TRFP (Felis catus chain 1 isoform A protein moiety
reduced) (9CI) (CA INDEX NAME)
OTHER NAMES:
CN Allergen Fel d I (Felis catus chain 1 isoform A precursor protein
moiety reduced)
CN Leader A-human T cell-reactive feline protein chain 1 (cat)
FS PROTEIN SEQUENCE
MF C461 H748 N116 O134 S6
CI MAN
SR CA
LC STN Files: CA
DES 5:ALL,L

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
4 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: P 120:75451
REFERENCE 2: P 119:93527
REFERENCE 3: 118:122542
REFERENCE 4: P 115:205920

L2 ANSWER 7 OF 13 REGISTRY COPYRIGHT 1995 ACS
CN 23-92-Glycoprotein TRFP (Felis catus chain 1 isoform A protein
moiety reduced) (9CI) (CA INDEX NAME)
OTHER NAMES:



Cunningham 08/300,510 Page 4 

CN Allergen Fel d I (Felis catus chain 1 protein moiety reduced)
FS PROTEIN SEQUENCE
MF C348 H565 N87 O111 S4
CI MAN
SR CA
LC STN Files: CA
DES 5:ALL,L

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***
2 REFERENCES IN FILE CA (1967 TO DATE)

REFERENCE 1: 118:122542
REFERENCE 2: P 115:205920



L2 ANSWER 8 OF 13 REGISTRY COPYRIGHT 1995 ACS
CN L-Valine, L-alanyl-L-valyl-L-lysyl-L-arginyl-L-.alpha.-aspartyl-L-
valyl-L-.alpha.-aspartyl-L-leucyl-L-phenylalanyl-L-leucyl-L-
threonylglycyl-L-threonyl-L-prolyl-L-.alpha.-aspartyl-L-.alpha.-
glutamyl-L-tyrosyl-L-valyl-L-.alpha.-glutamyl-L-glutaminyl-L-valyl-L-
alanyl-L-glutaminyl-L-tyrosyl-L-lysyl-L-alanyl-L-leucyl-L-prolyl-
(9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF C149 H234 N36 O46
SR CA
LC STN Files: CA
DES 5:ALL,L
[Structure diagram: PAGE 1-A to PAGE 4-A]

1 REFERENCES IN FILE CA (1967 TO DATE) 
REFERENCE 1: P 115:205920 
L2 ANSWER 9 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 136380-73-9 REGISTRY
CN L-Glutamic acid, L-tyrosyl-L-lysyl-L-alanyl-L-leucyl-L-prolyl-L-
valyl-L-valyl-L-leucyl-L-.alpha.-glutamyl-L-asparaginyl-L-alanyl-L-
arginyl-L-isoleucyl-L-leucyl-L-lysyl-L-asparaginyl-L-cysteinyl-L-
valyl-L-.alpha.-aspartyl-L-alanyl-L-lysyl-L-methionyl-L-threonyl-L-
.alpha.-glutamyl-L-.alpha.-glutamyl-L-.alpha.-aspartyl-L-lysyl-
(9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF C140 H235 N37 O45 S2
SR CA
LC STN Files: CA
DES 5:ALL,L



[Structure diagram: PAGE 1-A to PAGE 3-A]

1 REFERENCES IN FILE CA (1967 TO DATE)
REFERENCE 1: P 115:205920



L2 ANSWER 10 OF 13 REGISTRY COPYRIGHT 1995 ACS
RN 136380-72-8 REGISTRY
CN L-Glutamic acid, L-alanyl-L-glutaminyl-L-tyrosyl-L-lysyl-L-alanyl-L-
leucyl-L-prolyl-L-valyl-L-valyl-L-leucyl-L-.alpha.-glutamyl-L-
asparaginyl-L-alanyl-L-arginyl-L-isoleucyl-L-leucyl-L-lysyl-L-
asparaginyl-L-cysteinyl-L-valyl-L-.alpha.-aspartyl-L-alanyl-L-lysyl-
L-methionyl-L-threonyl-L-.alpha.-glutamyl-L-.alpha.-glutamyl-L-
.alpha.-aspartyl-L-lysyl- (9CI) (CA INDEX NAME)
FS PROTEIN SEQUENCE
MF C148 H248 N40 O48 S2
SR CA
LC STN Files: CA
DES 5:ALL,L



[Structure diagram: PAGE 1-A to PAGE 3-A]

1 REFERENCES IN FILE CA (1967 TO DATE)
REFERENCE 1: P 115:205920



L2 ANSWER 11 OF 13 REGISTRY COPYRIGHT 1995 ACS
CN L-Glutamic acid, L-lysyl-L-alanyl-L-leucyl-L-prolyl-L-valyl-L-valyl-
L-leucyl-L-.alpha.-glutamyl-L-asparaginyl-L-alanyl-L-arginyl-L-
isoleucyl-L-leucyl-L-lysyl-L-asparaginyl-L-cysteinyl-L-valyl-L-
.alpha.-aspartyl-L-alanyl-L-lysyl-L-methionyl-L-threonyl-L-.alpha.-
glutamyl-L-.alpha.-glutamyl-L-.alpha.-aspartyl-L-lysyl- (9CI) (CA
INDEX NAME)
OTHER NAMES:
CN Human T cell-reactive feline protein chain 1 (29-55) (cat synthetic)
FS PROTEIN SEQUENCE
MF C131 H226 N36 O43 S2
SR CA
LC STN Files: CA
DES 5:ALL,L



[Structure diagram: PAGE 1-A to PAGE 1-B]

REFERENCE 2: P 120:75451
CN L-Valine, L-valyl-L-lysyl-L-arginyl-L-.alpha.-aspartyl-L-valyl-L-
.alpha.-aspartyl-L-leucyl-L-phenylalanyl-L-leucyl-L-threonylglycyl-L-
threonyl-L-prolyl-L-.alpha.-aspartyl-L-.alpha.-glutamyl-L-tyrosyl-L-
valyl-L-.alpha.-glutamyl-L-glutaminyl-L-valyl-L-alanyl-L-glutaminyl-
L-tyrosyl-L-lysyl-L-alanyl-L-leucyl-L-prolyl- (9CI) (CA INDEX NAME)
L3 ANSWER 1 OF 5 HCA COPYRIGHT 1995 ACS
AN 121:228217 HCA
TI Potential therapeutic recombinant proteins comprised of peptides
containing recombined T cell epitopes
AU Rogers, Bruce L.; Bond, Julian F.; Craig, Sandra J.; Nault,
Anneliese K.; Segal, Debra B.; Morgenstern, Jay P.; Chen, Meei-Song;
Bizinkauskas, Christine B.; Counsell, Catherine M.; et al.
CS ImmuLogic Pharm. Corp., Waltham, MA, 02154, USA
SO Mol. Immunol. (1994), 31(13), 955-66
CODEN: MOIMD5; ISSN: 0161-5890
DT Journal
AB The complete primary structure of Fel d I has been detd. and shown to
be comprised of two sep. polypeptide chains (designated chain 1 and
chain 2). Overlapping peptides covering the entire sequence of both
chains of Fel d I have been used to map the major areas of human T
cell reactivity. The present study describes three non-contiguous T
cell reactive regions of <30 aa in length that were assembled in all
six possible configurations using PCR and recombinant DNA methods.
These six recombinant proteins comprised of defined non-contiguous T
cell epitope regions artificially combined into single polypeptide
chains have been expressed in E. coli, highly purified and examd.
for their ability to bind to human cat-allergic IgE and for human T
cell reactivity. Several of these recombined T cell epitope-contg.
polypeptides exhibit markedly reduced IgE binding as compared to the
native Fel d I. Importantly, the human T cell reactivity to
individual T cell epitope-contg. regions is maintained even though
each was placed in an unnatural position as compared to the native
mol. In addn., T cell responses to potential junctional epitopes
were not detected. It was also demonstrated in mice that s.c.
injection of T cell epitope-contg. polypeptides inhibits the T cell
response to the individual peptides upon subsequent challenge in
vitro. Thus, these recombined T cell epitope-contg. polypeptides,
which harbor multiple T cell reactive regions but have significantly
reduced reactivity with allergic human IgE, constitute a novel
potential approach for desensitization to important allergens.
IT 136380-56-8 136380-69-3
(potential therapeutic recombinant proteins comprised of peptides
contg. recombined T cell epitopes from allergens)

L3 ANSWER 2 OF 5 HCA COPYRIGHT 1995 ACS
AN 120:75451 HCA
TI Peptides useful for inducing immune tolerance
IN Gefter, Malcolm L.; Garman, Richard D.; Greenstein, Julia L.; Kuo,
Mei Chang; Briner, Thomas J.; Morville, Malcolm
PA Immunologic Pharmaceutical Corp., USA
SO PCT Int. Appl., 107 pp.
CODEN: PIXXD2
PI WO 9319178 A2 930930
DS W: AU, CA, FI, HU, JP, KP, KR, NO, NZ, PT
RW: AT, BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE
AI WO 93-US2462 930325
PRAI US 92-857311 920325
US 92-884718 920515
US 93-6116 930115
DT Patent
LA English
AB A compn. contg. .gtoreq.1 peptide derived from a human
T-cell-reactive feline protein, a protein antigen, an allergen, or
an autoantigen is used for treating a disease which involves an
immune response to the feline protein, protein antigen, allergen, or
autoantigen. The peptides comprise a sufficient percentage of the T
cell epitopes of an antigen (from Felis, Ambrosia, etc.), allergen
(Der p I, Der f I, etc.) or autoantigen (insulin, myelin basic
protein, etc.). Thus, s.c. administration of a combination of
Lys-Arg-Asp-Val-Asp-Leu-Phe-Leu-Thr-Gly-Thr-Pro-Asp-Glu-Tyr-Val-Glu-
Gln-Val-Ala-Gln-Tyr-Lys-Ala-Leu-Pro-Val and Lys-Ala-Leu-Pro-Val-Val-
Leu-Glu-Asn-Ala-Arg-Ile-Leu-Lys-Asn-Cys-Val-Asp-Ala-Lys-Met-Thr-Glu-
Glu-Asp-Lys-Glu (peptides derived from T-cell-reactive feline
protein) induced T cell tolerance in mice.
IT 136380-56-8 136380-69-3 

(T-cell tolerance induction with) 
IT 136380-56-8D, Human T-cell-reactive feline protein-derived 
peptide mixts. 136380-69-3D, Human T-cell-reactive feline 
protein-derived peptide mixts. 

(for T-cell tolerance induction) 
IT 136796-96-8, Glycoprotein TRFP (Felis catus chain 1 isoform 
A protein moiety reduced) 136796-97-9, Glycoprotein TRFP 
(Felis catus chain 1 isoform B protein moiety reduced) 
(peptide derived from, for T-cell tolerance induction) 

L3 ANSWER 3 OF 5 HCA COPYRIGHT 1995 ACS 
AN 119:93527 HCA 

TI Recombitope peptides containing T cell epitopes and stimulating T 

cell activity, for allergy therapy and diagnosis 
IN Rogers, Bruce L.; Morgenstern, Jay P.; Bond, Julian F.; Garman, 

Richard D.; Kuo, Mei Chang; Morville, Malcolm 
PA Immulogic Pharmaceutical Corp., USA 
SO PCT Int. Appl., 73 pp. 

CODEN: PIXXD2 
PI WO 9308280 A1 930429 
DS W: AU, CA, FI, HU, JP, KR, NO 

RW: AT, BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, SE 
AI WO 92-US8694 921016 
PRAI US 91-777859 911016 

US 91-807529 911213 
DT Patent 
LA English 

AB Recombitope peptides, stimulating T cell activity and comprising 
.gtoreq.2 T cell epitopes derived from the same or from different 
protein antigens, are provided. The peptides can be derived from 
protein allergens, autoantigens, or other protein antigens. Methods 
of diagnosing sensitivity to an allergen or other protein antigen, 
methods to treat such sensitivity, methods for designing recombitope 
peptides where the protein antigen has unknown or ill-defined T cell 
epitopes, and therapeutic compns. are also disclosed. T cell 
epitopic studies were done with peptides and protein chains of the 
human T cell-reactive feline protein (TRFP) and immunoreactive 
regions were identified. Synthetic oligonucleotides were designed 
with Escherichia coli-preferred codons for PCR amplification and 
expression in E. coli of recombitope peptides from TRFP. The 
peptide sequences included a 6 His residue leader sequence (for 
allowing purifn. of the expressed recombitope peptide using QIAGEN 
NTA-agarose) and a thrombin cleavage site before the actual 
recombitope sequence. Recombitope peptide arrangements were 
identified which had little to no binding to IgE and which gave 
responses to T cells of patients allergic to TRFP. 
IT 136796-96-8, Leader A-human T cell-reactive feline protein 
chain 1 (cat) 136796-97-9, Leader B-human T cell-reactive 
feline protein chain 1 (cat) 

(amino acid sequence of and T cell epitopes-contg. T 
cell-stimulating recombitope peptides recombinant prepn. in 
relation to) 

IT 136380-69-3, Human T cell-reactive feline protein chain 1 
(29-55) (cat synthetic) 

(recombitope peptide contg., cat allergy detection and treatment 

with) 

L3 ANSWER 4 OF 5 HCA COPYRIGHT 1995 ACS 
AN 118:122542 HCA 

TI Amino acid sequence of Fel d I, the major allergen of the domestic 
cat: protein sequence analysis and cDNA cloning 